Take-home Exercise 3: Provincial Competitiveness Index influence on FDI in Vietnam

Author

kai feng

Published

October 15, 2024

Modified

November 11, 2024

Introduction

Provincial Competitiveness Index in Vietnam

Context: Vietnam’s provinces vary significantly in competitiveness, as captured by the Provincial Competitiveness Index (PCI). This index evaluates key dimensions such as entry costs, land access, transparency, and labor policies, which influence the investment climate and economic potential of each region.

Challenges: Provinces aiming to attract investment face challenges related to regional disparities and governance effectiveness. Understanding PCI dimensions is essential for identifying strengths and areas for improvement.

Analysis Focus

Objectives: This analysis aims to evaluate PCI dimensions through linear regression, examining their correlation with FDI projects and FDI registered capital inflow to identify combinations that drive investment.

Goals:

  • Identify Key Factors: Determine which PCI dimensions most influence FDI total projects and FDI total registered capital.

  • Province-Specific Insights: Highlight PCI factors lacking in specific provinces to guide policymaking.

  • Actionable Recommendations: Provide targeted suggestions for enhancing PCI dimensions to improve the investment climate.

Significance

This project will analyze how PCI dimensions affect Vietnam’s economic landscape, offering actionable insights to help policymakers enhance regional competitiveness and stimulate sustainable development.



1.0 Setup

1.1 Installing R-Packages

  • sf:

    • For handling spatial vector data and transforming it into simple features (sf) objects.

    • Functions like st_read() for importing spatial data and st_transform() for coordinate reference system transformations.

  • tidyverse: For data manipulation and transformation, including functions for working with tibble data frames.

  • readr: For reading in CSV or other text-based data files.

  • openxlsx, readxl: For reading and writing XLSX files.

  • dplyr: For data manipulation (e.g. grouping and summarising the relationships between columns).

  • knitr, gtsummary: For styling tables.
  • tmap: For creating thematic maps

  • animation, png, magick: For animation work

  • sfdep: For performing both local and global spatial autocorrelation analysis
  • ggstatsplot: For visualising relationships with statistical details.
  • olsrr: For building OLS regression models and running diagnostic tests.
  • performance: For visually comparing models.
  • GWmodel: For calibrating geographically weighted models.
pacman::p_load(tidyverse, sf, readr, tmap, dplyr, knitr, animation, png, magick, openxlsx, readxl, sfdep, ggstatsplot, olsrr, performance, gtsummary, GWmodel)


1.2 Data Acquisition

We will be using the following datasets:

  • Source: Vietnam Statistics Office, Provincial Competitiveness Index

  • Provincial Competitiveness Index (PCI): To evaluate the competitive environment of each province, identifying strengths and weaknesses that influence investment potential.

  • Foreign Direct Investment (FDI): To assess the attractiveness of provinces for foreign investors and identify trends in investment across different sectors.


1.3 Data Preparation and Wrangling

provincial_boundaries <- st_read(dsn = "data/boundaries/provincial", layer="geoBoundaries-VNM-ADM1")
class(provincial_boundaries)
st_crs(provincial_boundaries)

provincial_boundaries <- provincial_boundaries %>%
  st_transform(crs = 3405) # Transform coordinate

# Drop & Rename column
provincial_boundaries <- provincial_boundaries %>% 
  select(shapeName, shapeISO, shapeGroup, geometry) %>% 
  rename(
    province_vn = shapeName,
    province_code = shapeISO,
    country_code = shapeGroup
  )

# Create a new column 'province_en' based on 'province_code'
provincial_boundaries <- provincial_boundaries %>%
  mutate(province_en = case_when(
    province_code == "VN-44" ~ "An Giang",
    province_code == "VN-43" ~ "BRVT",
    province_code == "VN-54" ~ "Bac Giang",
    province_code == "VN-53" ~ "Bac Kan",
    province_code == "VN-55" ~ "Bac Lieu",
    province_code == "VN-56" ~ "Bac Ninh",
    province_code == "VN-50" ~ "Ben Tre",
    province_code == "VN-31" ~ "Binh Dinh",
    province_code == "VN-57" ~ "Binh Duong",
    province_code == "VN-58" ~ "Binh Phuoc",
    province_code == "VN-40" ~ "Binh Thuan",
    province_code == "VN-59" ~ "Ca Mau",
    province_code == "VN-CT" ~ "Can Tho",
    province_code == "VN-04" ~ "Cao Bang",
    province_code == "VN-DN" ~ "Da Nang",
    province_code == "VN-33" ~ "Dak Lak",
    province_code == "VN-72" ~ "Dak Nong",
    province_code == "VN-71" ~ "Dien Bien",
    province_code == "VN-39" ~ "Dong Nai",
    province_code == "VN-45" ~ "Dong Thap",
    province_code == "VN-30" ~ "Gia Lai",
    province_code == "VN-SG" ~ "HCMC",
    province_code == "VN-03" ~ "Ha Giang",
    province_code == "VN-63" ~ "Ha Nam",
    province_code == "VN-HN" ~ "Ha Noi",
    province_code == "VN-23" ~ "Ha Tinh",
    province_code == "VN-61" ~ "Hai Duong",
    province_code == "VN-HP" ~ "Hai Phong",
    province_code == "VN-73" ~ "Hau Giang",
    province_code == "VN-14" ~ "Hoa Binh",
    province_code == "VN-66" ~ "Hung Yen",
    province_code == "VN-34" ~ "Khanh Hoa",
    province_code == "VN-47" ~ "Kien Giang",
    province_code == "VN-28" ~ "Kon Tum",
    province_code == "VN-01" ~ "Lai Chau",
    province_code == "VN-35" ~ "Lam Dong",
    province_code == "VN-09" ~ "Lang Son",
    province_code == "VN-02" ~ "Lao Cai",
    province_code == "VN-41" ~ "Long An",
    province_code == "VN-67" ~ "Nam Dinh",
    province_code == "VN-22" ~ "Nghe An",
    province_code == "VN-18" ~ "Ninh Binh",
    province_code == "VN-36" ~ "Ninh Thuan",
    province_code == "VN-68" ~ "Phu Tho",
    province_code == "VN-32" ~ "Phu Yen",
    province_code == "VN-24" ~ "Quang Binh",
    province_code == "VN-27" ~ "Quang Nam",
    province_code == "VN-29" ~ "Quang Ngai",
    province_code == "VN-13" ~ "Quang Ninh",
    province_code == "VN-25" ~ "Quang Tri",
    province_code == "VN-52" ~ "Soc Trang",
    province_code == "VN-05" ~ "Son La",
    province_code == "VN-26" ~ "TT-Hue",
    province_code == "VN-37" ~ "Tay Ninh",
    province_code == "VN-20" ~ "Thai Binh",
    province_code == "VN-69" ~ "Thai Nguyen",
    province_code == "VN-21" ~ "Thanh Hoa",
    province_code == "VN-46" ~ "Tien Giang",
    province_code == "VN-51" ~ "Tra Vinh",
    province_code == "VN-07" ~ "Tuyen Quang",
    province_code == "VN-49" ~ "Vinh Long",
    province_code == "VN-70" ~ "Vinh Phuc",
    province_code == "VN-06" ~ "Yen Bai"
  )) %>% 
  select(province_en, everything())

write_rds(provincial_boundaries, "data/rds/provincial_boundaries.rds")
Note

Since the coordinate reference system of provincial_boundaries is EPSG:4326 (unit of measurement = degree), we have to transform it to a projected CRS, here EPSG:3405.
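To see why degree units are awkward for measurement, a rough calculation helps. The sketch below assumes a spherical Earth with a radius of 6,371 km (an approximation introduced here, not part of the original analysis) and compares the ground length of one degree of longitude at two illustrative latitudes that roughly span Vietnam. The function name km_per_degree_lon is made up for this example.

```r
# Rough illustration only: spherical Earth, radius 6371 km (an assumption).
earth_radius_km <- 6371

# Ground length of one degree of longitude at a given latitude (in degrees).
km_per_degree_lon <- function(lat_deg) {
  (2 * pi * earth_radius_km / 360) * cos(lat_deg * pi / 180)
}

# Illustrative latitudes: about 10 N (southern Vietnam) and 22 N (northern Vietnam).
km_south <- km_per_degree_lon(10)
km_north <- km_per_degree_lon(22)

# One degree covers several kilometres less in the north than in the south,
# so distances expressed in degrees are not comparable across provinces.
km_south - km_north
```

A projected CRS avoids this problem because its coordinates are already in a linear unit.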

We also need an English name for each province so that the province boundaries can be joined with the other datasets.
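As an alternative to repeating a long case_when() for every dataset, the code-to-name mapping could be kept in one lookup table and applied with match(). This is only a sketch: province_lookup and boundaries_demo are hypothetical objects, and only three provinces are shown.

```r
# Hypothetical lookup table; only three provinces are shown for brevity.
province_lookup <- data.frame(
  province_code = c("VN-44", "VN-HN", "VN-SG"),
  province_en   = c("An Giang", "Ha Noi", "HCMC"),
  stringsAsFactors = FALSE
)

# Stand-in for the boundary attribute table (no geometry in this sketch).
boundaries_demo <- data.frame(
  province_code = c("VN-SG", "VN-44", "VN-HN"),
  stringsAsFactors = FALSE
)

# match() returns, for each code, its row position in the lookup table.
idx <- match(boundaries_demo$province_code, province_lookup$province_code)
boundaries_demo$province_en <- province_lookup$province_en[idx]

# An NA here would flag a code that is missing from the lookup table.
stopifnot(!anyNA(boundaries_demo$province_en))
boundaries_demo
```

Keeping the mapping in one place means a corrected name or code only has to be changed once.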

pci_2021 <- read_xlsx("data/provincial_competitiveness_index/2021.xlsx")

pci_2021 <- pci_2021 %>%
  mutate(
    province_code = case_when(
      province_en == "An Giang" ~ "VN-44",
      province_en == "BRVT" ~ "VN-43",
      province_en == "Bac Giang" ~ "VN-54",
      province_en == "Bac Kan" ~ "VN-53",
      province_en == "Bac Lieu" ~ "VN-55",
      province_en == "Bac Ninh" ~ "VN-56",
      province_en == "Ben Tre" ~ "VN-50",
      province_en == "Binh Dinh" ~ "VN-31",
      province_en == "Binh Duong" ~ "VN-57",
      province_en == "Binh Phuoc" ~ "VN-58",
      province_en == "Binh Thuan" ~ "VN-40",
      province_en == "Ca Mau" ~ "VN-59",
      province_en == "Can Tho" ~ "VN-CT",
      province_en == "Cao Bang" ~ "VN-04",
      province_en == "Da Nang" ~ "VN-DN",
      province_en == "Dak Lak" ~ "VN-33",
      province_en == "Dak Nong" ~ "VN-72",
      province_en == "Dien Bien" ~ "VN-71",
      province_en == "Dong Nai" ~ "VN-39",
      province_en == "Dong Thap" ~ "VN-45",
      province_en == "Gia Lai" ~ "VN-30",
      province_en == "HCMC" ~ "VN-SG",
      province_en == "Ha Giang" ~ "VN-03",
      province_en == "Ha Nam" ~ "VN-63",
      province_en == "Ha Noi" ~ "VN-HN",
      province_en == "Ha Tinh" ~ "VN-23",
      province_en == "Hai Duong" ~ "VN-61",
      province_en == "Hai Phong" ~ "VN-HP",
      province_en == "Hau Giang" ~ "VN-73",
      province_en == "Hoa Binh" ~ "VN-14",
      province_en == "Hung Yen" ~ "VN-66",
      province_en == "Khanh Hoa" ~ "VN-34",
      province_en == "Kien Giang" ~ "VN-47",
      province_en == "Kon Tum" ~ "VN-28",
      province_en == "Lai Chau" ~ "VN-01",
      province_en == "Lam Dong" ~ "VN-35",
      province_en == "Lang Son" ~ "VN-09",
      province_en == "Lao Cai" ~ "VN-02",
      province_en == "Long An" ~ "VN-41",
      province_en == "Nam Dinh" ~ "VN-67",
      province_en == "Nghe An" ~ "VN-22",
      province_en == "Ninh Binh" ~ "VN-18",
      province_en == "Ninh Thuan" ~ "VN-36",
      province_en == "Phu Tho" ~ "VN-68",
      province_en == "Phu Yen" ~ "VN-32",
      province_en == "Quang Binh" ~ "VN-24",
      province_en == "Quang Nam" ~ "VN-27",
      province_en == "Quang Ngai" ~ "VN-29",
      province_en == "Quang Ninh" ~ "VN-13",
      province_en == "Quang Tri" ~ "VN-25",
      province_en == "Soc Trang" ~ "VN-52",
      province_en == "Son La" ~ "VN-05",
      province_en == "TT-Hue" ~ "VN-26",
      province_en == "Tay Ninh" ~ "VN-37",
      province_en == "Thai Binh" ~ "VN-20",
      province_en == "Thai Nguyen" ~ "VN-69",
      province_en == "Thanh Hoa" ~ "VN-21",
      province_en == "Tien Giang" ~ "VN-46",
      province_en == "Tra Vinh" ~ "VN-51",
      province_en == "Tuyen Quang" ~ "VN-07",
      province_en == "Vinh Long" ~ "VN-49",
      province_en == "Vinh Phuc" ~ "VN-70",
      province_en == "Yen Bai" ~ "VN-06",
    )
  ) %>%
  select(province_en, province_code, everything())

write.xlsx(pci_2021, "data/rds/pci_2021.xlsx")




fdi <- read_xlsx("data/fdi.xlsx")
# Rename columns
colnames(fdi) <- c("province_en", "total_project_count", 
                   "total_registered_capital")
# Remove the first two rows
fdi <- fdi[-c(1, 2), ]

fdi <- fdi %>%
  mutate(
    province_code = case_when(
      province_en == "An Giang" ~ "VN-44",
      province_en == "Ba Ria - Vung Tau" ~ "VN-43",
      province_en == "Bac Giang" ~ "VN-54",
      province_en == "Bac Kan" ~ "VN-53",
      province_en == "Bac Lieu" ~ "VN-55",
      province_en == "Bac Ninh" ~ "VN-56",
      province_en == "Ben Tre" ~ "VN-50",
      province_en == "Binh Dinh" ~ "VN-31",
      province_en == "Binh  Duong" ~ "VN-57",
      province_en == "Binh Phuoc" ~ "VN-58",
      province_en == "Binh Thuan" ~ "VN-40",
      province_en == "Ca Mau" ~ "VN-59",
      province_en == "Can Tho" ~ "VN-CT",
      province_en == "Cao Bang" ~ "VN-04",
      province_en == "Da Nang" ~ "VN-DN",
      province_en == "Dak Lak" ~ "VN-33",
      province_en == "Dak Nong" ~ "VN-72",
      province_en == "Dien Bien" ~ "VN-71",
      province_en == "Dong Nai" ~ "VN-39",
      province_en == "Dong Thap" ~ "VN-45",
      province_en == "Gia Lai" ~ "VN-30",
      province_en == "Ho Chi Minh city" ~ "VN-SG",
      province_en == "Ha Giang" ~ "VN-03",
      province_en == "Ha Nam" ~ "VN-63",
      province_en == "Ha Noi" ~ "VN-HN",
      province_en == "Ha Tinh" ~ "VN-23",
      province_en == "Hai Duong" ~ "VN-61",
      province_en == "Hai Phong" ~ "VN-HP",
      province_en == "Hau Giang" ~ "VN-73",
      province_en == "Hoa Binh" ~ "VN-14",
      province_en == "Hung Yen" ~ "VN-66",
      province_en == "Khanh  Hoa" ~ "VN-34",
      province_en == "Kien  Giang" ~ "VN-47",
      province_en == "Kon Tum" ~ "VN-28",
      province_en == "Lai Chau" ~ "VN-01",
      province_en == "Lam Dong" ~ "VN-35",
      province_en == "Lang Son" ~ "VN-09",
      province_en == "Lao Cai" ~ "VN-02",
      province_en == "Long An" ~ "VN-41",
      province_en == "Nam Dinh" ~ "VN-67",
      province_en == "Nghe An" ~ "VN-22",
      province_en == "Ninh Binh" ~ "VN-18",
      province_en == "Ninh  Thuan" ~ "VN-36",
      province_en == "Phu Tho" ~ "VN-68",
      province_en == "Phu Yen" ~ "VN-32",
      province_en == "Quang Binh" ~ "VN-24",
      province_en == "Quang  Nam" ~ "VN-27",
      province_en == "Quang  Ngai" ~ "VN-29",
      province_en == "Quang Ninh" ~ "VN-13",
      province_en == "Quang Tri" ~ "VN-25",
      province_en == "Soc Trang" ~ "VN-52",
      province_en == "Son La" ~ "VN-05",
      province_en == "Thua Thien-Hue" ~ "VN-26",
      province_en == "Tay Ninh" ~ "VN-37",
      province_en == "Thai Binh" ~ "VN-20",
      province_en == "Thai  Nguyen" ~ "VN-69",
      province_en == "Thanh Hoa" ~ "VN-21",
      province_en == "Tien Giang" ~ "VN-46",
      province_en == "Tra Vinh" ~ "VN-51",
      province_en == "Tuyen Quang" ~ "VN-07",
      province_en == "Vinh Long" ~ "VN-49",
      province_en == "Vinh Phuc" ~ "VN-70",
      province_en == "Yen Bai" ~ "VN-06",
    )
  ) %>%
  select(province_en, province_code, everything())

fdi <- fdi %>% 
  left_join(provincial_boundaries, by = "province_code") %>% 
  select(province_en.x, province_code, total_project_count, total_registered_capital, geometry) %>% 
  rename(province_en = province_en.x)

write_rds(fdi, "data/rds/fdi.rds")
Note

The PCI 2021 workbook was formatted inconsistently, so I created a new sheet called ‘summary’ and renamed the original to ‘summary - old’. The new sheet uses the XLOOKUP function to populate values from the old sheet, which was faster than cleaning the data in R, where separate code would be needed for each data type.

For the economy_pie dataset, the same simple reformatting was applied, from the ‘Summary - old’ sheet into the ‘Summary’ sheet.
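Several FDI province names matched earlier contain irregular spacing (for example "Binh  Duong" with two spaces). One way to make the matching less fragile is to normalise whitespace before looking up the code and then list anything that failed to match. The sketch below uses base R only; raw_names and code_lookup are illustrative stand-ins, not the actual sheet contents.

```r
# Province names as they might appear in the raw FDI sheet (illustrative).
raw_names <- c("Binh  Duong", " Khanh  Hoa", "Quang  Ngai ", "Ha Noi")

# Collapse runs of whitespace to a single space, then trim both ends.
clean_names <- trimws(gsub("\\s+", " ", raw_names))

# Hypothetical lookup keyed on the cleaned names.
code_lookup <- c(
  "Binh Duong" = "VN-57",
  "Khanh Hoa"  = "VN-34",
  "Quang Ngai" = "VN-29",
  "Ha Noi"     = "VN-HN"
)
codes <- unname(code_lookup[clean_names])

# Names that failed to match, to be fixed before any join.
unmatched <- raw_names[is.na(codes)]
unmatched
```

An empty result for unmatched indicates that every row received a province code.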



2.0 Importing the clean set of data

provincial_boundaries <- read_rds("data/rds/provincial_boundaries.rds")

pci_2021 <- read_xlsx("data/rds/pci_2021.xlsx")

fdi <- read_rds("data/rds/fdi.rds")



3.0 Prioritization Analysis for Provincial Development: Identifying Key Predictors

3.1 Correlation Matrix

The PCI consists of nine dimensions, each serving as an independent variable with varying degrees of influence on FDI data.

Given that some dimensions may exhibit high correlation with one another, it is essential to identify these correlated pairs and select only one variable from each pair for analysis.

To achieve this, we compute a correlation matrix to assess the relationships between the dimensions.

ggcorrmat(pci_2021[,4:13])

Note

Interpretation

A pair of dimensions with an absolute correlation coefficient above 0.8 is considered highly correlated.

No pair in the matrix is highly correlated. We will reconfirm this later with the check in [4.6 Checking for multicollinearity].
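The 0.8 rule can also be applied programmatically rather than by reading the plot. The following is a minimal sketch on synthetic data (not the PCI table); the column names v1 to v4 are made up, and v4 is deliberately constructed to correlate strongly with v1 so that the filter has something to find.

```r
set.seed(1)
# Synthetic stand-in for the PCI dimension columns (not the real data).
n <- 64
demo <- data.frame(v1 = rnorm(n), v2 = rnorm(n), v3 = rnorm(n))
demo$v4 <- demo$v1 + rnorm(n, sd = 0.1)  # built to correlate strongly with v1

cor_mat <- cor(demo)

# Look only at the upper triangle so each pair is reported once.
high <- which(abs(cor_mat) > 0.8 & upper.tri(cor_mat), arr.ind = TRUE)

high_pairs <- data.frame(
  var1 = rownames(cor_mat)[high[, "row"]],
  var2 = colnames(cor_mat)[high[, "col"]],
  r    = cor_mat[high],
  stringsAsFactors = FALSE
)
high_pairs
```

With the real data, an empty high_pairs table would correspond to the finding that no pair exceeds the threshold.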


3.2 Conduct Linear Regression

To explore the influence of each PCI dimension on FDI, we begin with a linear regression model. This initial model will help us determine the relationship between each independent variable (PCI dimensions) and FDI.

By examining the direction and size of each coefficient, we can start to understand the general influence of each dimension. This setup provides a foundation for refining our analysis and identifying key predictors in subsequent steps.

pci_2021 <- pci_2021 %>% 
  left_join(fdi %>% 
              select(province_code, total_project_count, total_registered_capital), 
            by = "province_code")

pci_2021$total_registered_capital <- as.numeric(as.character(pci_2021$total_registered_capital))

pci_2021$total_project_count <- as.numeric(as.character(pci_2021$total_project_count))
pci_project_mlr <- lm(formula = total_project_count ~ `Entry Costs` + 
                  `Land Access` + Transparency + 
                  `Time Costs` + `Informal charges` + Proactivity + 
                  `Business Support Policy` + `Labor Policy` +
                `Law & Order`,
                data=pci_2021)

ols_regress(pci_project_mlr)
                            Model Summary                             
---------------------------------------------------------------------
R                         0.645       RMSE                  1302.611 
R-Squared                 0.416       MSE                2011017.608 
Adj. R-Squared            0.319       Coef. Var              246.439 
Pred R-Squared            0.125       AIC                   1121.656 
MAE                     825.133       SBC                   1145.404 
---------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 
 AIC: Akaike Information Criteria 
 SBC: Schwarz Bayesian Criteria 

                                 ANOVA                                   
------------------------------------------------------------------------
                     Sum of                                             
                    Squares        DF    Mean Square      F        Sig. 
------------------------------------------------------------------------
Regression     77389878.935         9    8598875.437    4.276     3e-04 
Residual      108594950.815        54    2011017.608                    
Total         185984829.750        63                                   
------------------------------------------------------------------------

                                             Parameter Estimates                                              
-------------------------------------------------------------------------------------------------------------
                    model        Beta    Std. Error    Std. Beta      t        Sig         lower       upper 
-------------------------------------------------------------------------------------------------------------
              (Intercept)     985.260      4334.255                  0.227    0.821    -7704.398    9674.917 
            `Entry Costs`    -578.238       362.995       -0.181    -1.593    0.117    -1305.998     149.523 
            `Land Access`     219.114       451.280        0.061     0.486    0.629     -685.648    1123.877 
             Transparency    -335.325       326.078       -0.123    -1.028    0.308     -989.073     318.422 
             `Time Costs`     459.403       340.015        0.204     1.351    0.182     -222.286    1141.091 
       `Informal charges`    -453.756       386.112       -0.184    -1.175    0.245    -1227.864     320.351 
              Proactivity    -189.001       393.744       -0.064    -0.480    0.633     -978.409     600.407 
`Business Support Policy`     681.895       246.477        0.310     2.767    0.008      187.738    1176.052 
           `Labor Policy`     846.539       279.910        0.361     3.024    0.004      285.354    1407.724 
            `Law & Order`    -632.826       443.014       -0.210    -1.428    0.159    -1521.015     255.363 
-------------------------------------------------------------------------------------------------------------
tbl_regression(pci_project_mlr, 
               intercept = TRUE) %>% 
  add_glance_source_note(
    label = list(sigma ~ "\U03C3"),
    include = c(r.squared, adj.r.squared, 
                AIC, statistic,
                p.value, sigma))
Characteristic                Beta    95% CI¹            p-value
(Intercept)                    985    -7,704, 9,675      0.8
Entry Costs                   -578    -1,306, 150        0.12
Land Access                    219    -686, 1,124        0.6
Transparency                  -335    -989, 318          0.3
Time Costs                     459    -222, 1,141        0.2
Informal charges              -454    -1,228, 320        0.2
Proactivity                   -189    -978, 600          0.6
Business Support Policy        682    188, 1,176         0.008
Labor Policy                   847    285, 1,408         0.004
Law & Order                   -633    -1,521, 255        0.2
R² = 0.416; Adjusted R² = 0.319; AIC = 1,122; Statistic = 4.28; p-value = <0.001; σ = 1,418
¹ CI = Confidence Interval
Note

Model Summary

Adjusted R-Squared of 0.319 indicates that approximately 31.9% of the variation in the FDI total number of projects can be accounted for by the independent variables, adjusting for the number of predictors in the model.
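The adjusted value of 0.319 in the ols_regress() output can be reproduced from the unadjusted R-Squared and the degrees of freedom in the ANOVA table. The sketch below uses the standard adjusted R-squared formula; the variable names are chosen for this example, and the sample size of 64 is inferred from the Total DF of 63.

```r
# Values read from the ols_regress() output above.
r_squared    <- 0.416  # R-Squared from the Model Summary
n_obs        <- 64     # Total DF (63) + 1, inferred from the ANOVA table
n_predictors <- 9      # Regression DF from the ANOVA table

# Adjusted R-squared penalises R-squared for the number of predictors.
adj_r_squared <- 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)
round(adj_r_squared, 3)
```

The gap between 0.416 and the adjusted value reflects the cost of fitting nine predictors on a small sample.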

ANOVA - Analysis of Variance (F test)

The F-ratio of 4.276 is significant at p < 0.001. Hence, our regression model is statistically significant, suggesting that at least some of the PCI dimensions meaningfully contribute to predicting the FDI total number of projects.

The model summary and ANOVA results reveal that while the overall model has a moderate level of predictive power, with some independent variables (such as Business Support Policy and Labor Policy) showing significant contributions, others (like Entry Costs and Transparency) did not demonstrate strong effects.
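The F-ratio and its p-value for the project-count model can be reproduced from the mean squares and degrees of freedom in the ANOVA table. This is a small check using base R's pf(); the variable names are chosen for this example.

```r
# Values read from the ANOVA table of the project-count model.
ms_regression <- 8598875.437  # Mean Square, Regression row
ms_residual   <- 2011017.608  # Mean Square, Residual row
df_regression <- 9
df_residual   <- 54

# F-ratio: explained variance per predictor over unexplained variance per residual DF.
f_ratio <- ms_regression / ms_residual

# Upper-tail probability of the F distribution gives the p-value.
p_value <- pf(f_ratio, df_regression, df_residual, lower.tail = FALSE)

round(f_ratio, 3)
p_value < 0.001
```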

pci_capital_mlr <- lm(formula = total_registered_capital ~ `Entry Costs` + 
                  `Land Access` + Transparency + 
                  `Time Costs` + `Informal charges` + Proactivity + 
                  `Business Support Policy` + `Labor Policy` +
                `Law & Order`,
                data=pci_2021)

ols_regress(pci_capital_mlr)
                             Model Summary                              
-----------------------------------------------------------------------
R                          0.739       RMSE                   7912.959 
R-Squared                  0.546       MSE                74210280.110 
Adj. R-Squared             0.470       Coef. Var               117.038 
Pred R-Squared             0.373       AIC                    1352.585 
MAE                     6437.884       SBC                    1376.333 
-----------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 
 AIC: Akaike Information Criteria 
 SBC: Schwarz Bayesian Criteria 

                                   ANOVA                                    
---------------------------------------------------------------------------
                      Sum of                                               
                     Squares        DF      Mean Square      F        Sig. 
---------------------------------------------------------------------------
Regression    4821217884.446         9    535690876.050    7.219    0.0000 
Residual      4007355125.955        54     74210280.110                    
Total         8828573010.401        63                                     
---------------------------------------------------------------------------

                                               Parameter Estimates                                                
-----------------------------------------------------------------------------------------------------------------
                    model          Beta    Std. Error    Std. Beta      t        Sig          lower        upper 
-----------------------------------------------------------------------------------------------------------------
              (Intercept)    -32220.037     26329.252                 -1.224    0.226    -85007.009    20566.936 
            `Entry Costs`     -3738.724      2205.079       -0.170    -1.696    0.096     -8159.642      682.194 
            `Land Access`      2393.728      2741.389        0.097     0.873    0.386     -3102.425     7889.882 
             Transparency       252.004      1980.823        0.013     0.127    0.899     -3719.308     4223.316 
             `Time Costs`      5067.961      2065.484        0.326     2.454    0.017       926.914     9209.008 
       `Informal charges`     -3828.241      2345.509       -0.226    -1.632    0.108     -8530.704      874.222 
              Proactivity     -2310.748      2391.870       -0.113    -0.966    0.338     -7106.159     2484.663 
`Business Support Policy`      5247.362      1497.272        0.347     3.505    0.001      2245.512     8249.211 
           `Labor Policy`      6748.025      1700.364        0.418     3.969    0.000      3339.000    10157.050 
            `Law & Order`     -3233.972      2691.170       -0.155    -1.202    0.235     -8629.443     2161.499 
-----------------------------------------------------------------------------------------------------------------
tbl_regression(pci_capital_mlr, 
               intercept = TRUE) %>% 
  add_glance_source_note(
    label = list(sigma ~ "\U03C3"),
    include = c(r.squared, adj.r.squared, 
                AIC, statistic,
                p.value, sigma))
Characteristic                Beta    95% CI¹             p-value
(Intercept)                -32,220    -85,007, 20,567     0.2
Entry Costs                 -3,739    -8,160, 682         0.10
Land Access                  2,394    -3,102, 7,890       0.4
Transparency                   252    -3,719, 4,223       0.9
Time Costs                   5,068    927, 9,209          0.017
Informal charges            -3,828    -8,531, 874         0.11
Proactivity                 -2,311    -7,106, 2,485       0.3
Business Support Policy      5,247    2,246, 8,249        <0.001
Labor Policy                 6,748    3,339, 10,157       <0.001
Law & Order                 -3,234    -8,629, 2,161       0.2
R² = 0.546; Adjusted R² = 0.470; AIC = 1,353; Statistic = 7.22; p-value = <0.001; σ = 8,615
¹ CI = Confidence Interval
Note

Model Summary

Adjusted R-Squared of 0.470 indicates that approximately 47.0% of the variation in FDI total registered capital can be accounted for by the independent variables, adjusting for the number of predictors in the model.

ANOVA - Analysis of Variance (F test)

The F-ratio of 7.219 is significant at p < 0.001. Hence, our regression model is statistically significant, suggesting that at least some of the PCI dimensions meaningfully contribute to predicting FDI total registered capital.

This model summary and ANOVA indicate that the model explains FDI total registered capital better than the project-count model explains project counts (Adjusted R-Squared of 0.470 versus 0.319), although roughly half of the variation remains unexplained. The significance of specific predictors, such as Business Support Policy, Labor Policy, and Time Costs, suggests that these are influential variables in explaining FDI total registered capital.

Summary of Findings

For the total number of FDI projects, the model has an Adjusted R-Squared of 0.319, indicating that approximately 31.9% of the variation can be explained by the independent variables. The ANOVA results show a significant F-ratio of 4.276 (p < 0.001), suggesting that some PCI dimensions significantly contribute to the model. However, several predictors, such as Entry Costs and Transparency, did not have statistically significant effects.

In contrast, the model for total registered capital has a higher Adjusted R-Squared of 0.470, indicating that 47.0% of the variation is accounted for by the predictors. The ANOVA shows a significant F-ratio of 7.219 (p < 0.001), confirming the model’s strength. Key variables like Business Support Policy, Labor Policy, and Time Costs are particularly influential in explaining FDI total registered capital.


3.3 Run Models to Select Independent Variables

I will now run different stepwise regression models to further investigate the specific positive and negative impacts of the independent variables on FDI.

This analysis will allow us to quantify how much a 1-unit increase in each independent variable is expected to influence FDI, providing clearer insights into their contributions to both the total number of projects and total registered capital.
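The "1-unit increase" reading of a regression coefficient can be checked directly: if two cases differ by one unit in a single predictor and are identical otherwise, the difference in their predicted values equals that predictor's coefficient. The sketch below uses synthetic data with made-up variable names (policy, labor, projects), not the PCI or FDI figures.

```r
set.seed(123)
# Synthetic data: not the PCI or FDI figures used in this exercise.
n <- 64
demo <- data.frame(policy = runif(n, 3, 9), labor = runif(n, 3, 9))
demo$projects <- 50 + 20 * demo$policy + 10 * demo$labor + rnorm(n, sd = 5)

fit <- lm(projects ~ policy + labor, data = demo)

# Two hypothetical provinces that differ only by one unit of `policy`.
base_case <- data.frame(policy = 6, labor = 6)
plus_one  <- data.frame(policy = 7, labor = 6)

change <- predict(fit, plus_one) - predict(fit, base_case)

# The predicted change equals the fitted coefficient on `policy`.
all.equal(unname(change), unname(coef(fit)["policy"]))
```

This holds for any linear model without interaction terms, which is the form used throughout this section.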

All of the models below build on the base model formulated in [3.2 Conduct Linear Regression].
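The general idea of forward selection by p-value can be sketched in base R: start with no predictors, and at each step add the candidate with the smallest p-value, stopping when no remaining candidate falls below the threshold. This is a rough illustration on synthetic data and is not olsrr's implementation; forward_by_p, p_enter and the columns x1 to x4 are names made up for this example.

```r
set.seed(42)
# Synthetic data: y truly depends on x1 and x2 only.
n <- 64
demo <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
demo$y <- 3 * demo$x1 - 2 * demo$x2 + rnorm(n)

# Hypothetical helper: forward selection driven by coefficient p-values.
forward_by_p <- function(data, response, candidates, p_enter = 0.05) {
  selected <- character(0)
  repeat {
    remaining <- setdiff(candidates, selected)
    if (length(remaining) == 0) break
    # p-value of each remaining candidate when added to the current model.
    p_values <- sapply(remaining, function(v) {
      fit <- lm(reformulate(c(selected, v), response = response), data = data)
      summary(fit)$coefficients[v, "Pr(>|t|)"]
    })
    best <- names(p_values)[which.min(p_values)]
    # Stop when even the best candidate is not significant enough to enter.
    if (p_values[[best]] >= p_enter) break
    selected <- c(selected, best)
  }
  selected
}

selected <- forward_by_p(demo, "y", c("x1", "x2", "x3", "x4"))
selected
```

Backward elimination works in the opposite direction: it starts from the full model and removes the least significant predictor at each step.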

pci_project_fw_mlr <- ols_step_forward_p(
  pci_project_mlr, # this is the model
  p_val = 0.05,
  details = FALSE)

pci_project_fw_mlr

                                      Stepwise Summary                                      
------------------------------------------------------------------------------------------
Step    Variable                       AIC         SBC        SBIC        R2       Adj. R2 
------------------------------------------------------------------------------------------
 0      Base Model                   1138.091    1142.409    955.661    0.00000    0.00000 
 1      `Business Support Policy`    1127.319    1133.796    945.026    0.18091    0.16770 
 2      `Labor Policy`               1121.525    1130.161    939.623    0.27482    0.25105 
 3      `Law & Order`                1117.240    1128.035    936.032    0.34265    0.30979 
------------------------------------------------------------------------------------------

Final Model Output 
------------------

                            Model Summary                             
---------------------------------------------------------------------
R                         0.585       RMSE                  1382.121 
R-Squared                 0.343       MSE                2037609.764 
Adj. R-Squared            0.310       Coef. Var              248.063 
Pred R-Squared            0.162       AIC                   1117.240 
MAE                     837.721       SBC                   1128.035 
---------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 
 AIC: Akaike Information Criteria 
 SBC: Schwarz Bayesian Criteria 

                                  ANOVA                                    
--------------------------------------------------------------------------
                     Sum of                                               
                    Squares        DF     Mean Square      F         Sig. 
--------------------------------------------------------------------------
Regression     63728243.939         3    21242747.980    10.425    0.0000 
Residual      122256585.811        60     2037609.764                     
Total         185984829.750        63                                     
--------------------------------------------------------------------------

                                             Parameter Estimates                                               
--------------------------------------------------------------------------------------------------------------
                    model         Beta    Std. Error    Std. Beta      t        Sig         lower       upper 
--------------------------------------------------------------------------------------------------------------
              (Intercept)    -3908.420      2946.943                 -1.326    0.190    -9803.183    1986.343 
`Business Support Policy`      752.745       234.838        0.343     3.205    0.002      282.998    1222.492 
           `Labor Policy`      883.432       255.821        0.377     3.453    0.001      371.713    1395.151 
            `Law & Order`     -815.756       327.846       -0.270    -2.488    0.016    -1471.546    -159.967 
--------------------------------------------------------------------------------------------------------------
plot(pci_project_fw_mlr)

pci_project_bw_mlr <- ols_step_backward_p(
  pci_project_mlr, # this is the model
  p_val = 0.05,
  details = FALSE)

pci_project_bw_mlr

                                  Stepwise Summary                                   
-----------------------------------------------------------------------------------
Step    Variable                AIC         SBC        SBIC        R2       Adj. R2 
-----------------------------------------------------------------------------------
 0      Full Model            1121.656    1145.404    943.667    0.41611    0.31879 
 1      Proactivity           1119.929    1141.518    941.482    0.41362    0.32833 
 2      `Land Access`         1118.159    1137.589    939.287    0.41151    0.33794 
 3      `Informal charges`    1117.405    1134.676    937.879    0.39994    0.33677 
 4      `Time Costs`          1116.714    1131.826    936.614    0.38754    0.33474 
 5      Transparency          1116.758    1129.712    936.060    0.36766    0.32478 
 6      `Entry Costs`         1117.240    1128.035    936.032    0.34265    0.30979 
-----------------------------------------------------------------------------------

Final Model Output 
------------------

                            Model Summary                             
---------------------------------------------------------------------
R                         0.585       RMSE                  1382.121 
R-Squared                 0.343       MSE                2037609.764 
Adj. R-Squared            0.310       Coef. Var              248.063 
Pred R-Squared            0.162       AIC                   1117.240 
MAE                     837.721       SBC                   1128.035 
---------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 
 AIC: Akaike Information Criteria 
 SBC: Schwarz Bayesian Criteria 

                                  ANOVA                                    
--------------------------------------------------------------------------
                     Sum of                                               
                    Squares        DF     Mean Square      F         Sig. 
--------------------------------------------------------------------------
Regression     63728243.939         3    21242747.980    10.425    0.0000 
Residual      122256585.811        60     2037609.764                     
Total         185984829.750        63                                     
--------------------------------------------------------------------------

                                             Parameter Estimates                                               
--------------------------------------------------------------------------------------------------------------
                    model         Beta    Std. Error    Std. Beta      t        Sig         lower       upper 
--------------------------------------------------------------------------------------------------------------
              (Intercept)    -3908.420      2946.943                 -1.326    0.190    -9803.183    1986.343 
`Business Support Policy`      752.745       234.838        0.343     3.205    0.002      282.998    1222.492 
           `Labor Policy`      883.432       255.821        0.377     3.453    0.001      371.713    1395.151 
            `Law & Order`     -815.756       327.846       -0.270    -2.488    0.016    -1471.546    -159.967 
--------------------------------------------------------------------------------------------------------------
plot(pci_project_bw_mlr)

#| fig-width: 12
#| fig-height: 10

pci_project_sb_mlr <- ols_step_both_p(
  pci_project_mlr, # this is the model
  p_val = 0.05,
  details = FALSE)

pci_project_sb_mlr

                                        Stepwise Summary                                        
----------------------------------------------------------------------------------------------
Step    Variable                           AIC         SBC        SBIC        R2       Adj. R2 
----------------------------------------------------------------------------------------------
 0      Base Model                       1138.091    1142.409    955.661    0.00000    0.00000 
 1      `Business Support Policy` (+)    1127.319    1133.796    945.026    0.18091    0.16770 
 2      `Labor Policy` (+)               1121.525    1130.161    939.623    0.27482    0.25105 
 3      `Law & Order` (+)                1117.240    1128.035    936.032    0.34265    0.30979 
----------------------------------------------------------------------------------------------

Final Model Output 
------------------

                            Model Summary                             
---------------------------------------------------------------------
R                         0.585       RMSE                  1382.121 
R-Squared                 0.343       MSE                2037609.764 
Adj. R-Squared            0.310       Coef. Var              248.063 
Pred R-Squared            0.162       AIC                   1117.240 
MAE                     837.721       SBC                   1128.035 
---------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 
 AIC: Akaike Information Criteria 
 SBC: Schwarz Bayesian Criteria 

                                  ANOVA                                    
--------------------------------------------------------------------------
                     Sum of                                               
                    Squares        DF     Mean Square      F         Sig. 
--------------------------------------------------------------------------
Regression     63728243.939         3    21242747.980    10.425    0.0000 
Residual      122256585.811        60     2037609.764                     
Total         185984829.750        63                                     
--------------------------------------------------------------------------

                                             Parameter Estimates                                               
--------------------------------------------------------------------------------------------------------------
                    model         Beta    Std. Error    Std. Beta      t        Sig         lower       upper 
--------------------------------------------------------------------------------------------------------------
              (Intercept)    -3908.420      2946.943                 -1.326    0.190    -9803.183    1986.343 
`Business Support Policy`      752.745       234.838        0.343     3.205    0.002      282.998    1222.492 
           `Labor Policy`      883.432       255.821        0.377     3.453    0.001      371.713    1395.151 
            `Law & Order`     -815.756       327.846       -0.270    -2.488    0.016    -1471.546    -159.967 
--------------------------------------------------------------------------------------------------------------
plot(pci_project_sb_mlr)

pci_capital_fw_mlr <- ols_step_forward_p(
  pci_capital_mlr, # this is the model
  p_val = 0.05,
  details = FALSE)

pci_capital_fw_mlr

                                      Stepwise Summary                                       
-------------------------------------------------------------------------------------------
Step    Variable                       AIC         SBC         SBIC        R2       Adj. R2 
-------------------------------------------------------------------------------------------
 0      Base Model                   1385.136    1389.454    1202.161    0.00000    0.00000 
 1      `Business Support Policy`    1367.982    1374.459    1185.110    0.25865    0.24669 
 2      `Labor Policy`               1355.344    1363.979    1173.177    0.41022    0.39089 
 3      `Law & Order`                1351.950    1362.744    1170.264    0.45789    0.43078 
-------------------------------------------------------------------------------------------

Final Model Output 
------------------

                             Model Summary                              
-----------------------------------------------------------------------
R                          0.677       RMSE                   8647.671 
R-Squared                  0.458       MSE                79767687.852 
Adj. R-Squared             0.431       Coef. Var               121.341 
Pred R-Squared             0.371       AIC                    1351.950 
MAE                     6737.299       SBC                    1362.744 
-----------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 
 AIC: Akaike Information Criteria 
 SBC: Schwarz Bayesian Criteria 

                                    ANOVA                                     
-----------------------------------------------------------------------------
                      Sum of                                                 
                     Squares        DF       Mean Square      F         Sig. 
-----------------------------------------------------------------------------
Regression    4042511739.270         3    1347503913.090    16.893    0.0000 
Residual      4786061271.131        60      79767687.852                     
Total         8828573010.401        63                                       
-----------------------------------------------------------------------------

                                               Parameter Estimates                                                
-----------------------------------------------------------------------------------------------------------------
                    model          Beta    Std. Error    Std. Beta      t        Sig          lower        upper 
-----------------------------------------------------------------------------------------------------------------
              (Intercept)    -44762.474     18438.462                 -2.428    0.018    -81644.889    -7880.059 
`Business Support Policy`      6351.523      1469.340        0.420     4.323    0.000      3412.406     9290.640 
           `Labor Policy`      7262.798      1600.626        0.450     4.537    0.000      4061.069    10464.526 
            `Law & Order`     -4711.518      2051.268       -0.227    -2.297    0.025     -8814.666     -608.370 
-----------------------------------------------------------------------------------------------------------------
plot(pci_capital_fw_mlr)

#| fig-width: 12
#| fig-height: 10

pci_capital_bw_mlr <- ols_step_backward_p(
  pci_capital_mlr, # this is the model
  p_val = 0.05,
  details = FALSE)

pci_capital_bw_mlr

                                   Stepwise Summary                                   
------------------------------------------------------------------------------------
Step    Variable                AIC         SBC         SBIC        R2       Adj. R2 
------------------------------------------------------------------------------------
 0      Full Model            1352.585    1376.333    1174.596    0.54609    0.47044 
 1      Transparency          1350.604    1372.193    1172.239    0.54596    0.47991 
 2      `Land Access`         1349.491    1368.921    1170.506    0.53962    0.48208 
 3      Proactivity           1348.523    1365.794    1168.952    0.53214    0.48289 
 4      `Informal charges`    1348.492    1363.604    1168.222    0.51752    0.47593 
 5      `Entry Costs`         1350.489    1363.443    1169.336    0.48642    0.45161 
 6      `Time Costs`          1351.950    1362.744    1170.264    0.45789    0.43078 
------------------------------------------------------------------------------------

Final Model Output 
------------------

                             Model Summary                              
-----------------------------------------------------------------------
R                          0.677       RMSE                   8647.671 
R-Squared                  0.458       MSE                79767687.852 
Adj. R-Squared             0.431       Coef. Var               121.341 
Pred R-Squared             0.371       AIC                    1351.950 
MAE                     6737.299       SBC                    1362.744 
-----------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 
 AIC: Akaike Information Criteria 
 SBC: Schwarz Bayesian Criteria 

                                    ANOVA                                     
-----------------------------------------------------------------------------
                      Sum of                                                 
                     Squares        DF       Mean Square      F         Sig. 
-----------------------------------------------------------------------------
Regression    4042511739.270         3    1347503913.090    16.893    0.0000 
Residual      4786061271.131        60      79767687.852                     
Total         8828573010.401        63                                       
-----------------------------------------------------------------------------

                                               Parameter Estimates                                                
-----------------------------------------------------------------------------------------------------------------
                    model          Beta    Std. Error    Std. Beta      t        Sig          lower        upper 
-----------------------------------------------------------------------------------------------------------------
              (Intercept)    -44762.474     18438.462                 -2.428    0.018    -81644.889    -7880.059 
`Business Support Policy`      6351.523      1469.340        0.420     4.323    0.000      3412.406     9290.640 
           `Labor Policy`      7262.798      1600.626        0.450     4.537    0.000      4061.069    10464.526 
            `Law & Order`     -4711.518      2051.268       -0.227    -2.297    0.025     -8814.666     -608.370 
-----------------------------------------------------------------------------------------------------------------
plot(pci_capital_bw_mlr)

#| fig-width: 12
#| fig-height: 10

pci_capital_sb_mlr <- ols_step_both_p(
  pci_capital_mlr, # this is the model
  p_val = 0.05,
  details = FALSE)

pci_capital_sb_mlr

                                        Stepwise Summary                                         
-----------------------------------------------------------------------------------------------
Step    Variable                           AIC         SBC         SBIC        R2       Adj. R2 
-----------------------------------------------------------------------------------------------
 0      Base Model                       1385.136    1389.454    1202.161    0.00000    0.00000 
 1      `Business Support Policy` (+)    1367.982    1374.459    1185.110    0.25865    0.24669 
 2      `Labor Policy` (+)               1355.344    1363.979    1173.177    0.41022    0.39089 
 3      `Law & Order` (+)                1351.950    1362.744    1170.264    0.45789    0.43078 
 4      `Time Costs` (+)                 1350.489    1363.443    1169.336    0.48642    0.45161 
 5      `Entry Costs` (+)                1348.492    1363.604    1168.222    0.51752    0.47593 
-----------------------------------------------------------------------------------------------

Final Model Output 
------------------

                             Model Summary                              
-----------------------------------------------------------------------
R                          0.719       RMSE                   8158.221 
R-Squared                  0.518       MSE                73441733.084 
Adj. R-Squared             0.476       Coef. Var               116.430 
Pred R-Squared             0.414       AIC                    1348.492 
MAE                     6522.401       SBC                    1363.604 
-----------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 
 AIC: Akaike Information Criteria 
 SBC: Schwarz Bayesian Criteria 

                                   ANOVA                                     
----------------------------------------------------------------------------
                      Sum of                                                
                     Squares        DF      Mean Square      F         Sig. 
----------------------------------------------------------------------------
Regression    4568952491.502         5    913790498.300    12.442    0.0000 
Residual      4259620518.899        58     73441733.084                     
Total         8828573010.401        63                                      
----------------------------------------------------------------------------

                                              Parameter Estimates                                                
----------------------------------------------------------------------------------------------------------------
                    model          Beta    Std. Error    Std. Beta      t        Sig          lower       upper 
----------------------------------------------------------------------------------------------------------------
              (Intercept)    -30495.886     19965.179                 -1.527    0.132    -70460.535    9468.762 
`Business Support Policy`      5572.081      1467.377        0.368     3.797    0.000      2634.807    8509.355 
           `Labor Policy`      6085.465      1630.387        0.377     3.733    0.000      2821.892    9349.039 
            `Law & Order`     -4842.805      2152.763       -0.233    -2.250    0.028     -9152.029    -533.581 
             `Time Costs`      3741.369      1713.689        0.241     2.183    0.033       311.048    7171.689 
            `Entry Costs`     -4197.282      2170.954       -0.191    -1.933    0.058     -8542.918     148.354 
----------------------------------------------------------------------------------------------------------------
plot(pci_capital_sb_mlr)


3.4 Model Selection

Next, we will utilize a radar chart to visualize the performance of the different models.

The model that has the most edges touching the outer boundary is considered the best performer, indicating stronger overall results across the evaluated metrics.

project_metric <- compare_performance(pci_project_mlr,
                              pci_project_fw_mlr$model,
                              pci_project_bw_mlr$model,
                              pci_project_sb_mlr$model)
Some of the nested models seem to be identical
# drop the trailing `$model` so the legend shows only the object name
project_metric$Name <- sub("\\$model$", "", project_metric$Name)

# plot radar
plot(project_metric)

capital_metric <- compare_performance(pci_capital_mlr,
                              pci_capital_fw_mlr$model,
                              pci_capital_bw_mlr$model,
                              pci_capital_sb_mlr$model)

# drop the trailing `$model` so the legend shows only the object name
capital_metric$Name <- sub("\\$model$", "", capital_metric$Name)

# plot radar
plot(capital_metric)

Note

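For predicting the total number of projects, the best-performing model is pci_project_sb_mlr; the forward, backward, and stepwise procedures all arrive at the same three-variable model, so their metrics coincide.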

In contrast, for predicting total registered capital, the best-performing model is pci_capital_sb_mlr.
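The message "Some of the nested models seem to be identical" can be checked directly by comparing the terms retained by each selection procedure. The sketch below is illustrative only: it uses the built-in mtcars data and base R's AIC-based step() as a stand-in for the p-value-based ols_step_*_p() functions, so neither the data nor the selection rule matches the analysis above.

```r
# Stand-in data and models (assumption: mtcars replaces the PCI data).
full <- lm(mpg ~ wt + hp + disp + qsec, data = mtcars)
null <- lm(mpg ~ 1, data = mtcars)

# AIC-based forward and backward selection with base R.
fw <- step(null, scope = formula(full), direction = "forward", trace = 0)
bw <- step(full, direction = "backward", trace = 0)

# Extract the predictors each procedure kept.
terms_fw <- attr(terms(fw), "term.labels")
terms_bw <- attr(terms(bw), "term.labels")

# TRUE when both procedures ended on the same set of predictors.
same_terms <- setequal(terms_fw, terms_bw)
same_terms
```

When two procedures retain the same predictors, their fit statistics are identical, so the radar chart cannot separate them.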


3.5 Visualize model parameters

We will now utilize the best-performing model to quantify the exact positive or negative impact (in numerical terms) that a one-unit change in the independent variables will have.

ggcoefstats(pci_project_sb_mlr$model,
            sort = "ascending")

ggcoefstats(pci_capital_sb_mlr$model,
            sort = "ascending")

Note

To enhance the attraction of Foreign Direct Investment (FDI) projects and registered capital, policymakers should prioritize improvements in Labor Policy and Business Support Policy.

  • For every one-unit increase in Labor Policy, the stepwise models estimate about 883 more FDI projects and about 6,085 more units of registered capital, holding the other predictors constant.

  • Similarly, a one-unit increase in Business Support Policy is associated with about 753 more FDI projects and about 5,572 more units of registered capital, holding the other predictors constant.

Focusing on these two policy areas will significantly bolster efforts to attract more FDI.
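As a rough worked example of reading these coefficients, the sketch below copies the estimates from the Parameter Estimates tables of pci_project_sb_mlr and pci_capital_sb_mlr and applies them to a hypothetical change in scores. The scenario values and the helper predicted_change() are assumptions made for illustration; they are not part of the original analysis.

```r
# Coefficients copied from the stepwise model output above.
coef_project <- c(`Business Support Policy` = 752.745,
                  `Labor Policy`            = 883.432,
                  `Law & Order`             = -815.756)

coef_capital <- c(`Business Support Policy` = 5572.081,
                  `Labor Policy`            = 6085.465,
                  `Law & Order`             = -4842.805,
                  `Time Costs`              = 3741.369,
                  `Entry Costs`             = -4197.282)

# Hypothetical scenario: Labor Policy rises by 1 point and
# Business Support Policy by 0.5 points; all other scores stay fixed.
scenario <- c(`Labor Policy` = 1, `Business Support Policy` = 0.5)

# Linear model: the predicted change is the sum of coefficient * change.
predicted_change <- function(coefs, delta) {
  sum(coefs[names(delta)] * delta)
}

change_projects <- predicted_change(coef_project, scenario)
change_capital  <- predicted_change(coef_capital, scenario)
```

Because the models are linear, the effects of the two changes simply add up; the intercept cancels out when comparing two scenarios.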


3.6 Checking for multicollinearity

We will now cross-check the correlation matrix using the Variance Inflation Factor (VIF).

Interpretation

  • < 5: low multicollinearity

  • 5-10: moderate multicollinearity

  • >10: strong multicollinearity
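For intuition, the VIF of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on all the other predictors; tolerance is its reciprocal, and the square root of the VIF is the factor by which the standard error is inflated. The sketch below computes these by hand in base R on the built-in mtcars data, which is only a stand-in for the PCI data.

```r
# Stand-in model (assumption: mtcars replaces the PCI data).
fit <- lm(mpg ~ wt + hp + disp, data = mtcars)

# Predictor matrix without the intercept column.
X <- model.matrix(fit)[, -1]

# Regress each predictor on the remaining ones and convert R^2 to a VIF.
vif_manual <- sapply(colnames(X), function(v) {
  others <- setdiff(colnames(X), v)
  r2 <- summary(lm(X[, v] ~ X[, others]))$r.squared
  1 / (1 - r2)
})

data.frame(VIF          = vif_manual,
           increased_se = sqrt(vif_manual),
           tolerance    = 1 / vif_manual)
```

A VIF of 1 means a predictor is uncorrelated with the others; the thresholds listed above are rules of thumb rather than formal tests.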

check_collinearity(pci_project_sb_mlr$model)
# Check for Multicollinearity

Low Correlation

                    Term  VIF    VIF 95% CI Increased SE Tolerance Tolerance 95% CI
 Business Support Policy 1.04 [1.00, 12.89]         1.02      0.96     [0.08, 1.00]
            Labor Policy 1.09 [1.00,  2.65]         1.04      0.92     [0.38, 1.00]
             Law & Order 1.08 [1.00,  3.13]         1.04      0.93     [0.32, 1.00]
plot(check_collinearity(pci_project_sb_mlr$model)) +
  theme(axis.text.x = element_text(
    angle = 45, 
    hjust = 1
  ))
Variable `Component` is not in your data frame :/

check_collinearity(pci_capital_sb_mlr$model)
# Check for Multicollinearity

Low Correlation

                    Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
 Business Support Policy 1.13 [1.02, 2.03]         1.06      0.89     [0.49, 0.98]
            Labor Policy 1.23 [1.06, 1.89]         1.11      0.81     [0.53, 0.95]
             Law & Order 1.29 [1.09, 1.93]         1.13      0.78     [0.52, 0.92]
              Time Costs 1.46 [1.19, 2.13]         1.21      0.68     [0.47, 0.84]
             Entry Costs 1.17 [1.03, 1.91]         1.08      0.85     [0.52, 0.97]
plot(check_collinearity(pci_capital_sb_mlr$model)) +
  theme(axis.text.x = element_text(
    angle = 45, 
    hjust = 1
  ))
Variable `Component` is not in your data frame :/

Note

No multicollinearity is found in either model: every VIF is below 1.5, well under the threshold of 5, for both Total Number of Projects and Total Registered Capital.


3.7 Linearity Assumption Test

project_out <- plot(check_model(pci_project_sb_mlr$model,
                        panel = FALSE))
For confidence bands, please install `qqplotr`.
project_out[[2]]

capital_out <- plot(check_model(pci_capital_sb_mlr$model,
                        panel = FALSE))
For confidence bands, please install `qqplotr`.
capital_out[[2]]


3.8 Normality Assumption Test

plot(check_normality(pci_project_sb_mlr$model))
For confidence bands, please install `qqplotr`.

plot(check_normality(pci_capital_sb_mlr$model))
For confidence bands, please install `qqplotr`.


3.9 Checking of outliers

project_outliers <- check_outliers(pci_project_sb_mlr$model,
                           method = "cook")

project_outliers
1 outlier detected: case 30.
- Based on the following method and threshold: cook (0.849).
- For variable: (Whole model).
plot(project_outliers)

capital_outliers <- check_outliers(pci_capital_sb_mlr$model,
                           method = "cook")

capital_outliers
OK: No outliers detected.
- Based on the following method and threshold: cook (0.902).
- For variable: (Whole model)
plot(capital_outliers)

Note

After conducting the tests, I conclude that the models for the Total Number of Projects and Total Registered Capital meet the necessary assumptions. The one exception to note is case 30, which Cook's distance flags as an influential outlier in the project model.
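To make the outlier check more concrete, the sketch below computes Cook's distance in base R and compares it against a cut-off. The cut-off used here, the median of an F distribution with (p, n - p) degrees of freedom, is an assumption on my part about how such thresholds are typically derived; the output above does not state the rule. The built-in mtcars data stand in for the PCI data.

```r
# Stand-in model (assumption: mtcars replaces the PCI data).
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

p <- length(coef(fit))        # estimated coefficients, intercept included
n <- nrow(model.frame(fit))   # observations used in the fit

# Influence of each observation on the fitted coefficients.
cooks_d <- cooks.distance(fit)

# Assumed cut-off: median of F(p, n - p).
threshold <- qf(0.5, df1 = p, df2 = n - p)

# Observations whose influence exceeds the cut-off.
flagged <- which(cooks_d > threshold)
flagged
```

A flagged case is not automatically an error; it marks an observation worth inspecting, for example by refitting the model without it and comparing coefficients.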



4.0 Spatial Non-Stationary Assumption

project_mlr_output <- as.data.frame(pci_project_sb_mlr$model$residuals) %>% 
  rename(`SB_MLR_RES` = `pci_project_sb_mlr$model$residuals`)

# join the newly created data frame
project_fdi_sf <- cbind(provincial_boundaries, 
                        project_mlr_output$SB_MLR_RES) %>%
  rename(`MLR_RES` = `project_mlr_output.SB_MLR_RES`)

tmap_mode("view")
tmap mode set to interactive viewing
tm_shape(provincial_boundaries)+
  tmap_options(check.and.fix = TRUE) +
  tm_polygons(alpha = 0.4) +
tm_shape(project_fdi_sf) +  
  tm_polygons(col = "MLR_RES",
          alpha = 0.6,
          size = 0.3,
          style="quantile") 
Warning: The shape provincial_boundaries is invalid (after reprojection). See
sf::st_is_valid
Warning: The shape project_fdi_sf is invalid (after reprojection). See
sf::st_is_valid
Variable(s) "MLR_RES" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
tmap_mode("plot")
tmap mode set to plotting
# compute k-nearest-neighbour (k = 6) row-standardised weights using st_knn() and st_weights() of sfdep.
project_fdi_sf <- project_fdi_sf %>%
  mutate(nb = st_knn(geometry, k=6,
                     longlat = FALSE),
         wt = st_weights(nb,
                         style = "W"),
         .before = 1)
! Polygon provided. Using point on surface.
# global moran_perm 
global_moran_perm(project_fdi_sf$MLR_RES, 
                  project_fdi_sf$nb, 
                  project_fdi_sf$wt, 
                  alternative = "two.sided", 
                  nsim = 999)

    Monte-Carlo simulation of Moran I

data:  x 
weights: listw  
number of simulations + 1: 1000 

statistic = -0.022142, observed rank = 470, p-value = 0.94
alternative hypothesis: two.sided
capital_mlr_output <- as.data.frame(pci_capital_sb_mlr$model$residuals) %>% 
  rename(`SB_MLR_RES` = `pci_capital_sb_mlr$model$residuals`)

# join the newly created data frame
capital_fdi_sf <- cbind(provincial_boundaries, 
                        capital_mlr_output$SB_MLR_RES) %>%
  rename(`MLR_RES` = `capital_mlr_output.SB_MLR_RES`)

tmap_mode("view")
tmap mode set to interactive viewing
tm_shape(provincial_boundaries)+
  tmap_options(check.and.fix = TRUE) +
  tm_polygons(alpha = 0.4) +
tm_shape(capital_fdi_sf) +  
  tm_polygons(col = "MLR_RES",
          alpha = 0.6,
          size = 0.3,
          style="quantile") 
Warning: The shape provincial_boundaries is invalid (after reprojection). See
sf::st_is_valid
Warning: The shape capital_fdi_sf is invalid (after reprojection). See
sf::st_is_valid
Variable(s) "MLR_RES" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
tmap_mode("plot")
tmap mode set to plotting
# compute k-nearest-neighbour (k = 6) row-standardised weights using st_knn() and st_weights() of sfdep.
capital_fdi_sf <- capital_fdi_sf %>%
  mutate(nb = st_knn(geometry, k=6,
                     longlat = FALSE),
         wt = st_weights(nb,
                         style = "W"),
         .before = 1)
! Polygon provided. Using point on surface.
# global moran_perm 
global_moran_perm(capital_fdi_sf$MLR_RES, 
                  capital_fdi_sf$nb, 
                  capital_fdi_sf$wt, 
                  alternative = "two.sided", 
                  nsim = 999)

    Monte-Carlo simulation of Moran I

data:  x 
weights: listw  
number of simulations + 1: 1000 

statistic = -0.016383, observed rank = 556, p-value = 0.888
alternative hypothesis: two.sided
Note

Based on the results of the Moran’s I permutation tests (p-value = 0.94 for the project model and 0.888 for the capital model), I can conclude that there is no evidence of significant spatial autocorrelation in the residuals of either model. This suggests that the residuals do not show systematic clustering or dispersion across the study area.
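The logic of the permutation test can be sketched in base R. The example below builds row-standardised k-nearest-neighbour weights on a synthetic 5 x 5 grid, computes Moran's I, and compares it with values obtained by shuffling the data across locations. The grid, the random values, and the two-sided p-value definition are all assumptions for illustration; global_moran_perm() may compute its p-value differently.

```r
set.seed(1)

# Synthetic locations on a 5 x 5 grid (assumption: stands in for provinces).
coords <- expand.grid(x = 1:5, y = 1:5)
n <- nrow(coords)
d <- as.matrix(dist(coords))

# Row-standardised weights over the k nearest neighbours (style "W").
k <- 4
W <- matrix(0, n, n)
for (i in seq_len(n)) {
  nb <- order(d[i, ])[2:(k + 1)]  # position 1 is the point itself
  W[i, nb] <- 1 / k
}

# Moran's I: weighted cross-product of deviations over total variance.
moran_i <- function(x, W) {
  z <- x - mean(x)
  (length(x) / sum(W)) * sum(W * outer(z, z)) / sum(z^2)
}

x <- rnorm(n)                      # stand-in for model residuals
obs <- moran_i(x, W)

# Reference distribution: Moran's I after randomly reassigning values.
sims <- replicate(999, moran_i(sample(x), W))

# One simple two-sided p-value based on distance from the simulated mean.
p_value <- (sum(abs(sims - mean(sims)) >= abs(obs - mean(sims))) + 1) /
  (length(sims) + 1)
```

A large p-value means the observed statistic is typical of what random spatial arrangements produce, which is the situation reported for both sets of residuals.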



5.0 Local

Preparing the data

pci_2021 <- pci_2021 %>% 
  left_join(provincial_boundaries %>% 
              select(province_code, geometry), 
            by = "province_code") %>% 
  st_as_sf()
Warning in left_join(., provincial_boundaries %>% select(province_code, : Detected an unexpected many-to-many relationship between `x` and `y`.
ℹ Row 11 of `x` matches multiple rows in `y`.
ℹ Row 2 of `y` matches multiple rows in `x`.
ℹ If a many-to-many relationship is expected, set `relationship =
  "many-to-many"` to silence this warning.

Fixed VS Adaptive Bandwidth
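With a boxcar kernel, every observation inside the bandwidth receives weight 1 and every other observation receives weight 0. A fixed bandwidth is a distance, so the number of neighbours varies by location; an adaptive bandwidth is a neighbour count, so the distance varies instead. The sketch below illustrates the difference on random points in base R. The points, the bandwidth of 30, and k = 4 are arbitrary assumptions, and the helper functions are hypothetical rather than part of GWmodel.

```r
set.seed(42)

# Ten random points in a 100 x 100 square (assumption: synthetic data).
pts <- cbind(x = runif(10, 0, 100), y = runif(10, 0, 100))
d <- as.matrix(dist(pts))

# Fixed boxcar: weight 1 for every point within distance bw.
boxcar_fixed <- function(d_row, bw) as.numeric(d_row <= bw)

# Adaptive boxcar: weight 1 for the k closest points (the point itself included).
boxcar_adaptive <- function(d_row, k) as.numeric(d_row <= sort(d_row)[k])

w_fixed    <- t(apply(d, 1, boxcar_fixed, bw = 30))
w_adaptive <- t(apply(d, 1, boxcar_adaptive, k = 4))

rowSums(w_fixed)     # varies from point to point
rowSums(w_adaptive)  # always k
```

This is why the fixed search below reports bandwidths in map units while the adaptive search reports a number of neighbours.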

bw.fixed_project <- bw.gwr(formula = total_project_count ~ 
                             `Entry Costs` + `Land Access` + Transparency + 
                             `Time Costs` + `Informal charges` + Proactivity + 
                             `Business Support Policy` + `Labor Policy` +
                             `Law & Order`,
                           data=pci_2021,
                           approach="CV", 
                           kernel="boxcar", 
                           adaptive=FALSE, 
                           longlat=FALSE)
Fixed bandwidth: 968784.2 CV score: 222394245 
Fixed bandwidth: 598861.3 CV score: 253731784 
Fixed bandwidth: 1197409 CV score: 166709145 
Fixed bandwidth: 1338707 CV score: 171205597 
Fixed bandwidth: 1110082 CV score: 208175729 
Fixed bandwidth: 1251380 CV score: 171769613 
Fixed bandwidth: 1164053 CV score: 161667008 
Fixed bandwidth: 1143438 CV score: 162515312 
Fixed bandwidth: 1176794 CV score: 165824047 
Fixed bandwidth: 1156179 CV score: 161322847 
Fixed bandwidth: 1151312 CV score: 161298848 
Fixed bandwidth: 1148305 CV score: 160623353 
Fixed bandwidth: 1146446 CV score: 163009911 
Fixed bandwidth: 1149453 CV score: 160891544 
Fixed bandwidth: 1147595 CV score: 160623353 
gwr.fixed_project <- gwr.basic(formula = total_project_count ~ 
                                 `Entry Costs` +  `Land Access` + Transparency + 
                                 `Time Costs` + `Informal charges` + Proactivity + 
                                 `Business Support Policy` + `Labor Policy` +
                                 `Law & Order`,
                               data=pci_2021,
                               bw=bw.fixed_project, 
                               kernel = 'boxcar', 
                               longlat = FALSE)

gwr.fixed_project
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-11 01:20:13.434497 
   Call:
   gwr.basic(formula = total_project_count ~ `Entry Costs` + `Land Access` + 
    Transparency + `Time Costs` + `Informal charges` + Proactivity + 
    `Business Support Policy` + `Labor Policy` + `Law & Order`, 
    data = pci_2021, bw = bw.fixed_project, kernel = "boxcar", 
    longlat = FALSE)

   Dependent (y) variable:  total_project_count
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
    Min      1Q  Median      3Q     Max 
-1500.4  -720.6  -243.1   295.9  7609.3 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)   
   (Intercept)                 1303.9     4351.0   0.300  0.76553   
   `Entry Costs`               -499.2      361.4  -1.381  0.17264   
   `Land Access`                250.5      453.1   0.553  0.58261   
   Transparency                -360.6      327.3  -1.102  0.27531   
   `Time Costs`                 411.7      340.4   1.210  0.23152   
   `Informal charges`          -454.6      388.0  -1.172  0.24628   
   Proactivity                 -142.3      394.6  -0.361  0.71970   
   `Business Support Policy`    611.7      243.7   2.510  0.01499 * 
   `Labor Policy`               810.5      280.4   2.891  0.00546 **
   `Law & Order`               -668.3      444.6  -1.503  0.13846   

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 1425 on 56 degrees of freedom
   Multiple R-squared: 0.3885
   Adjusted R-squared: 0.2902 
   F-statistic: 3.953 on 9 and 56 DF,  p-value: 0.0006067 
   ***Extra Diagnostic information
   Residual sum of squares: 113728132
   Sigma(hat): 1333.042
   AIC:  1157.038
   AICc:  1161.927
   BIC:  1161.21
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: boxcar 
   Fixed bandwidth: 1147595 
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                 Min.  1st Qu.   Median  3rd Qu.      Max.
   Intercept                 -2029.95  -637.88   596.48  1123.48 1882.8340
   `Entry Costs`              -749.00  -545.78  -493.11  -414.02 -281.8974
   `Land Access`              -201.59   218.75   264.32   435.01  548.4379
   Transparency               -658.25  -383.53  -360.58  -317.80    8.2392
   `Time Costs`                333.00   407.81   412.38   445.83  514.3275
   `Informal charges`         -671.17  -532.00  -452.95  -323.95    1.2336
   Proactivity                -289.04  -175.30  -142.30   -21.09   89.7491
   `Business Support Policy`   310.81   608.78   687.04   719.28  827.6244
   `Labor Policy`              589.33   765.44   834.32   923.34 1060.9729
   `Law & Order`             -1003.16  -724.69  -674.45  -636.49 -363.5316
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 13.16825 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 52.83175 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1163.596 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1139.972 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1115.975 
   Residual sum of squares: 100389487 
   R-square value:  0.4602377 
   Adjusted R-square value:  0.3231069 

   ***********************************************************************
   Program stops at: 2024-11-11 01:20:13.460021 
bw.adaptive_project <- bw.gwr(formula = total_project_count ~ 
                                `Entry Costs` + `Land Access` + Transparency + 
                                `Time Costs` + `Informal charges` + Proactivity +
                                `Business Support Policy` + `Labor Policy` +
                                `Law & Order`,
                              data=pci_2021,
                              approach="CV", 
                              kernel="boxcar", 
                              adaptive=TRUE, 
                              longlat=FALSE)
Adaptive bandwidth: 48 CV score: 204424128 
Adaptive bandwidth: 38 CV score: 235355686 
Adaptive bandwidth: 56 CV score: 181731389 
Adaptive bandwidth: 59 CV score: 174435740 
Adaptive bandwidth: 63 CV score: 168499922 
Adaptive bandwidth: 63 CV score: 168499922 
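The search trace above can be read as a small table: each row is a candidate number of nearest neighbours with its cross-validation score, and the calibrated bandwidth is the row with the lowest score. The sketch below re-enters the printed values by hand (the names `cv_trace` and `n_obs` are illustrative and not part of the original workflow) and computes what share of the 66 provinces the selected bandwidth covers.

```r
# CV trace copied from the bw.gwr() output above (adaptive bandwidth, boxcar kernel)
cv_trace <- data.frame(
  bandwidth = c(48, 38, 56, 59, 63),
  cv_score  = c(204424128, 235355686, 181731389, 174435740, 168499922)
)
n_obs <- 66  # number of data points reported in the gwr.basic() summaries

# The selected bandwidth is the candidate with the smallest CV score
best <- cv_trace[which.min(cv_trace$cv_score), ]

# Share of all observations that fall inside each local neighbourhood
share_of_sample <- best$bandwidth / n_obs

print(best)
print(round(share_of_sample, 3))
```

A bandwidth of 63 neighbours out of 66 observations means each local regression uses almost the whole sample, so the adaptive model can be expected to behave much like the global regression.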
gwr.adaptive_project <- gwr.basic(formula = total_project_count ~ 
                                    `Entry Costs` + `Land Access` + Transparency +
                                    `Time Costs` + `Informal charges` + Proactivity + 
                                    `Business Support Policy` + `Labor Policy` +
                                    `Law & Order`,
                                  data=pci_2021,
                                  bw=bw.adaptive_project,
                                  kernel = 'boxcar',
                                  adaptive=TRUE, 
                                  longlat = FALSE)

gwr.adaptive_project
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-11 01:20:13.508375 
   Call:
   gwr.basic(formula = total_project_count ~ `Entry Costs` + `Land Access` + 
    Transparency + `Time Costs` + `Informal charges` + Proactivity + 
    `Business Support Policy` + `Labor Policy` + `Law & Order`, 
    data = pci_2021, bw = bw.adaptive_project, kernel = "boxcar", 
    adaptive = TRUE, longlat = FALSE)

   Dependent (y) variable:  total_project_count
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
    Min      1Q  Median      3Q     Max 
-1500.4  -720.6  -243.1   295.9  7609.3 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)   
   (Intercept)                 1303.9     4351.0   0.300  0.76553   
   `Entry Costs`               -499.2      361.4  -1.381  0.17264   
   `Land Access`                250.5      453.1   0.553  0.58261   
   Transparency                -360.6      327.3  -1.102  0.27531   
   `Time Costs`                 411.7      340.4   1.210  0.23152   
   `Informal charges`          -454.6      388.0  -1.172  0.24628   
   Proactivity                 -142.3      394.6  -0.361  0.71970   
   `Business Support Policy`    611.7      243.7   2.510  0.01499 * 
   `Labor Policy`               810.5      280.4   2.891  0.00546 **
   `Law & Order`               -668.3      444.6  -1.503  0.13846   

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 1425 on 56 degrees of freedom
   Multiple R-squared: 0.3885
   Adjusted R-squared: 0.2902 
   F-statistic: 3.953 on 9 and 56 DF,  p-value: 0.0006067 
   ***Extra Diagnostic information
   Residual sum of squares: 113728132
   Sigma(hat): 1333.042
   AIC:  1157.038
   AICc:  1161.927
   BIC:  1161.21
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: boxcar 
   Adaptive bandwidth: 63 (number of nearest neighbours)
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                Min. 1st Qu.  Median 3rd Qu.     Max.
   Intercept                  214.29  214.29  971.88  971.88 1420.200
   `Entry Costs`             -575.15 -575.15 -561.30 -561.30 -463.273
   `Land Access`              221.39  221.39  289.30  353.31  353.307
   Transparency              -433.66 -353.89 -353.89 -333.83 -333.826
   `Time Costs`               359.70  414.10  414.10  468.73  468.725
   `Informal charges`        -487.83 -456.36 -456.36 -356.18 -356.180
   Proactivity               -187.82 -187.82 -140.26 -139.39  -48.202
   `Business Support Policy`  603.83  634.57  634.57  677.86  677.865
   `Labor Policy`             809.11  842.05  864.88  864.88  937.711
   `Law & Order`             -826.69 -737.49 -729.78 -637.65 -637.654
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 10.32786 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 55.67214 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1163.372 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1145.841 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1112.783 
   Residual sum of squares: 114549287 
   R-square value:  0.3841049 
   Adjusted R-square value:  0.267759 

   ***********************************************************************
   Program stops at: 2024-11-11 01:20:13.532168 
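The project-count models above use a boxcar kernel, whereas the capital models that follow use a bisquare kernel. As a rough illustration of the difference, the sketch below codes the textbook forms of both weighting functions in base R. The helper names `boxcar_weight()` and `bisquare_weight()` are hypothetical, the distances and bandwidth are made up, and the treatment of points lying exactly on the bandwidth is an assumption rather than a statement about GWmodel's internals.

```r
# Textbook kernel weights for distance d and bandwidth bw (illustrative helpers)
boxcar_weight <- function(d, bw) {
  # every observation inside the bandwidth receives full weight
  ifelse(d < bw, 1, 0)
}

bisquare_weight <- function(d, bw) {
  # weight decays smoothly from 1 at d = 0 to 0 at the bandwidth
  ifelse(d < bw, (1 - (d / bw)^2)^2, 0)
}

d  <- c(0, 250, 500, 750, 1250)  # example distances (made up)
bw <- 1000                       # example bandwidth (made up)

weights <- data.frame(
  distance = d,
  boxcar   = boxcar_weight(d, bw),
  bisquare = bisquare_weight(d, bw)
)
print(weights)
```

Under a boxcar kernel all neighbours inside the bandwidth count equally, which is consistent with the repeated quartile values in the adaptive boxcar summary above; a bisquare kernel down-weights distant provinces and so tends to produce smoother coefficient surfaces.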
bw.fixed_capital <- bw.gwr(formula = total_registered_capital ~ 
                             `Entry Costs` + `Land Access` + Transparency + 
                             `Time Costs` + `Informal charges` + Proactivity + 
                             `Business Support Policy` + `Labor Policy` +
                             `Law & Order`,
                           data=pci_2021,
                           approach="CV", 
                           kernel="bisquare", 
                           adaptive=FALSE, 
                           longlat=FALSE)
Fixed bandwidth: 968784.2 CV score: 7783319735 
Fixed bandwidth: 598861.3 CV score: 8456546162 
Fixed bandwidth: 1197409 CV score: 6907612208 
Fixed bandwidth: 1338707 CV score: 6394763233 
Fixed bandwidth: 1426034 CV score: 6166490448 
Fixed bandwidth: 1480005 CV score: 6058968306 
Fixed bandwidth: 1513361 CV score: 6003759868 
Fixed bandwidth: 1533976 CV score: 5973544648 
Fixed bandwidth: 1546717 CV score: 5956273318 
Fixed bandwidth: 1554591 CV score: 5946105624 
Fixed bandwidth: 1559458 CV score: 5940013121 
Fixed bandwidth: 1562465 CV score: 5936321848 
Fixed bandwidth: 1564324 CV score: 5934068480 
Fixed bandwidth: 1565473 CV score: 5932686416 
Fixed bandwidth: 1566183 CV score: 5931836280 
Fixed bandwidth: 1566622 CV score: 5931312400 
Fixed bandwidth: 1566893 CV score: 5930989208 
Fixed bandwidth: 1567061 CV score: 5930789688 
Fixed bandwidth: 1567164 CV score: 5930666463 
Fixed bandwidth: 1567228 CV score: 5930590338 
Fixed bandwidth: 1567268 CV score: 5930543303 
Fixed bandwidth: 1567292 CV score: 5930514238 
Fixed bandwidth: 1567308 CV score: 5930496277 
Fixed bandwidth: 1567317 CV score: 5930485177 
Fixed bandwidth: 1567323 CV score: 5930478317 
Fixed bandwidth: 1567326 CV score: 5930474077 
Fixed bandwidth: 1567328 CV score: 5930471457 
Fixed bandwidth: 1567330 CV score: 5930469838 
Fixed bandwidth: 1567331 CV score: 5930468837 
Fixed bandwidth: 1567331 CV score: 5930468219 
Fixed bandwidth: 1567331 CV score: 5930467836 
Fixed bandwidth: 1567332 CV score: 5930467600 
Fixed bandwidth: 1567332 CV score: 5930467454 
Fixed bandwidth: 1567332 CV score: 5930467364 
Fixed bandwidth: 1567332 CV score: 5930467308 
Fixed bandwidth: 1567332 CV score: 5930467274 
Fixed bandwidth: 1567332 CV score: 5930467252 
Fixed bandwidth: 1567332 CV score: 5930467239 
Fixed bandwidth: 1567332 CV score: 5930467231 
Fixed bandwidth: 1567332 CV score: 5930467226 
Fixed bandwidth: 1567332 CV score: 5930467223 
Fixed bandwidth: 1567332 CV score: 5930467221 
Fixed bandwidth: 1567332 CV score: 5930467220 
Fixed bandwidth: 1567332 CV score: 5930467219 
Fixed bandwidth: 1567332 CV score: 5930467219 
Fixed bandwidth: 1567332 CV score: 5930467218 
Fixed bandwidth: 1567332 CV score: 5930467218 
Fixed bandwidth: 1567332 CV score: 5930467218 
Fixed bandwidth: 1567332 CV score: 5930467218 
Fixed bandwidth: 1567332 CV score: 5930467218 
gwr.fixed_capital <- gwr.basic(formula = total_registered_capital ~ 
                                 `Entry Costs` + `Land Access` + Transparency + 
                                 `Time Costs` + `Informal charges` + Proactivity + 
                                 `Business Support Policy` + `Labor Policy` +
                                 `Law & Order`,
                               data=pci_2021,
                               bw=bw.fixed_capital, 
                               kernel = 'bisquare', 
                               longlat = FALSE)

gwr.fixed_capital
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-11 01:20:13.600976 
   Call:
   gwr.basic(formula = total_registered_capital ~ `Entry Costs` + 
    `Land Access` + Transparency + `Time Costs` + `Informal charges` + 
    Proactivity + `Business Support Policy` + `Labor Policy` + 
    `Law & Order`, data = pci_2021, bw = bw.fixed_capital, kernel = "bisquare", 
    longlat = FALSE)

   Dependent (y) variable:  total_registered_capital
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
   Min     1Q Median     3Q    Max 
-14578  -6219  -1215   5880  22546 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)    
   (Intercept)                 -34440      26619  -1.294 0.201028    
   `Entry Costs`                -4289       2211  -1.940 0.057402 .  
   `Land Access`                 2175       2772   0.785 0.435874    
   Transparency                   428       2002   0.214 0.831530    
   `Time Costs`                  5400       2082   2.593 0.012103 *  
   `Informal charges`           -3822       2374  -1.610 0.112990    
   Proactivity                  -2636       2414  -1.092 0.279523    
   `Business Support Policy`     5736       1491   3.847 0.000309 ***
   `Labor Policy`                6999       1715   4.081 0.000144 ***
   `Law & Order`                -2987       2720  -1.098 0.276892    

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 8718 on 56 degrees of freedom
   Multiple R-squared:  0.58
   Adjusted R-squared: 0.5125 
   F-statistic: 8.591 on 9 and 56 DF,  p-value: 6.006e-08 
   ***Extra Diagnostic information
   Residual sum of squares: 4256608383
   Sigma(hat): 8155.336
   AIC:  1396.117
   AICc:  1401.006
   BIC:  1400.29
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: bisquare 
   Fixed bandwidth: 1567332 
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                   Min.    1st Qu.     Median    3rd Qu.
   Intercept                 -39656.036 -37709.358 -36511.735 -35259.621
   `Entry Costs`              -7766.261  -6520.588  -4431.855  -2230.021
   `Land Access`                -43.236    692.711   2153.727   3568.962
   Transparency                -696.951   -242.575    385.228    604.899
   `Time Costs`                3700.055   4149.172   5788.037   7345.897
   `Informal charges`         -3779.846  -3549.856  -3434.954  -3209.836
   Proactivity                -4031.731  -3460.458  -2426.955   -890.564
   `Business Support Policy`   3464.883   4153.988   5602.217   5987.741
   `Labor Policy`              6390.413   6490.595   6778.570   7458.271
   `Law & Order`              -4128.646  -3918.244  -3182.281  -2150.035
                                  Max.
   Intercept                 -34262.35
   `Entry Costs`              -1482.41
   `Land Access`               4334.98
   Transparency                 721.93
   `Time Costs`                8232.04
   `Informal charges`         -2711.26
   Proactivity                 -297.88
   `Business Support Policy`   6071.96
   `Labor Policy`              8111.08
   `Law & Order`              -1610.82
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 17.98284 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 48.01716 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1406.215 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1377.055 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1360.196 
   Residual sum of squares: 3523822174 
   R-square value:  0.6522746 
   Adjusted R-square value:  0.5192787 

   ***********************************************************************
   Program stops at: 2024-11-11 01:20:13.625485 
bw.adaptive_capital <- bw.gwr(formula = total_registered_capital ~ 
                                `Entry Costs` + `Land Access` + Transparency + 
                                `Time Costs` + `Informal charges` + Proactivity +
                                `Business Support Policy` + `Labor Policy` +
                                `Law & Order`,
                              data=pci_2021,
                              approach="CV", 
                              kernel="bisquare", 
                              adaptive=TRUE, 
                              longlat=FALSE)
Adaptive bandwidth: 48 CV score: 7552329470 
Adaptive bandwidth: 38 CV score: 8078226145 
Adaptive bandwidth: 56 CV score: 7245214023 
Adaptive bandwidth: 59 CV score: 6996709773 
Adaptive bandwidth: 63 CV score: 6680191144 
Adaptive bandwidth: 63 CV score: 6680191144 
gwr.adaptive_capital <- gwr.basic(formula = total_registered_capital ~ 
                                    `Entry Costs` + `Land Access` + Transparency + 
                                    `Time Costs` + `Informal charges` + Proactivity + 
                                    `Business Support Policy` + `Labor Policy` +
                                    `Law & Order`,
                                  data=pci_2021,
                                  bw=bw.adaptive_capital, 
                                  kernel = 'bisquare',
                                  adaptive=TRUE, 
                                  longlat = FALSE)

gwr.adaptive_capital
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-11 01:20:13.671797 
   Call:
   gwr.basic(formula = total_registered_capital ~ `Entry Costs` + 
    `Land Access` + Transparency + `Time Costs` + `Informal charges` + 
    Proactivity + `Business Support Policy` + `Labor Policy` + 
    `Law & Order`, data = pci_2021, bw = bw.adaptive_capital, 
    kernel = "bisquare", adaptive = TRUE, longlat = FALSE)

   Dependent (y) variable:  total_registered_capital
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
   Min     1Q Median     3Q    Max 
-14578  -6219  -1215   5880  22546 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)    
   (Intercept)                 -34440      26619  -1.294 0.201028    
   `Entry Costs`                -4289       2211  -1.940 0.057402 .  
   `Land Access`                 2175       2772   0.785 0.435874    
   Transparency                   428       2002   0.214 0.831530    
   `Time Costs`                  5400       2082   2.593 0.012103 *  
   `Informal charges`           -3822       2374  -1.610 0.112990    
   Proactivity                  -2636       2414  -1.092 0.279523    
   `Business Support Policy`     5736       1491   3.847 0.000309 ***
   `Labor Policy`                6999       1715   4.081 0.000144 ***
   `Law & Order`                -2987       2720  -1.098 0.276892    

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 8718 on 56 degrees of freedom
   Multiple R-squared:  0.58
   Adjusted R-squared: 0.5125 
   F-statistic: 8.591 on 9 and 56 DF,  p-value: 6.006e-08 
   ***Extra Diagnostic information
   Residual sum of squares: 4256608383
   Sigma(hat): 8155.336
   AIC:  1396.117
   AICc:  1401.006
   BIC:  1400.29
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: bisquare 
   Adaptive bandwidth: 63 (number of nearest neighbours)
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                  Min.   1st Qu.    Median   3rd Qu.       Max.
   Intercept                 -41074.10 -39734.01 -37621.80 -36852.31 -32049.862
   `Entry Costs`              -7951.74  -7491.97  -4447.79  -1403.80  -1145.634
   `Land Access`               -329.15   -286.28   2032.35   4158.50   4446.563
   Transparency                -750.65   -616.85    507.46    744.27    959.962
   `Time Costs`                3551.50   3800.16   6762.80   8192.55   8344.559
   `Informal charges`         -3997.21  -3449.04  -3218.23  -2800.56  -2663.778
   Proactivity                -4250.08  -3779.54  -2332.51   -180.23      0.763
   `Business Support Policy`   3210.83   3227.38   4896.51   5840.28   5912.261
   `Labor Policy`              5492.11   6446.88   6512.91   7886.25   8236.279
   `Law & Order`              -4258.17  -4181.16  -3330.83  -1394.81  -1358.757
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 22.038 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 43.962 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1413.956 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1374.151 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1368.891 
   Residual sum of squares: 3191366879 
   R-square value:  0.6850808 
   Adjusted R-square value:  0.5235383 

   ***********************************************************************
   Program stops at: 2024-11-11 01:20:13.697176 
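To compare the four GWR calibrations with their global counterparts, the printed diagnostics can be collected into one table. The sketch below copies the AICc and adjusted R-square values by hand from the summaries above; the object name `model_fit` and its column names are illustrative only. In a live session the same figures should also be available from each model's `GW.diagnostic` element, which is not shown in the output above.

```r
# Diagnostics copied by hand from the printed gwr.basic() summaries
model_fit <- data.frame(
  response      = c("total_project_count", "total_project_count",
                    "total_registered_capital", "total_registered_capital"),
  bandwidth     = c("fixed", "adaptive", "fixed", "adaptive"),
  kernel        = c("boxcar", "boxcar", "bisquare", "bisquare"),
  gwr_aicc      = c(1163.596, 1163.372, 1406.215, 1413.956),
  global_aicc   = c(1161.927, 1161.927, 1401.006, 1401.006),
  gwr_adj_r2    = c(0.3231069, 0.2677590, 0.5192787, 0.5235383),
  global_adj_r2 = c(0.2902, 0.2902, 0.5125, 0.5125)
)

# Positive delta_aicc means the GWR model has a higher (worse) AICc than the global model
model_fit$delta_aicc   <- model_fit$gwr_aicc - model_fit$global_aicc
model_fit$delta_adj_r2 <- model_fit$gwr_adj_r2 - model_fit$global_adj_r2

print(model_fit[, c("response", "bandwidth", "delta_aicc", "delta_adj_r2")])
```

On these figures every GWR model has a higher AICc than the corresponding global regression, even where the adjusted R-square rises slightly, so by that criterion alone the evidence for spatially varying coefficients looks weak.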


Visualising Local R2

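# Convert the GWR output (SDF) into an sf data frame in the projected CRS EPSG:3405.
# Note: this overwrites the original pci_2021 object with the GWR results.
pci_2021 <- st_as_sf(gwr.fixed_project$SDF) %>%
  st_transform(crs=3405)

# Bind the SDF columns to the sf object. Because pci_2021 already holds these
# columns at this point, cbind() appends them a second time with a ".1" suffix
# (see the glimpse() output below).
gwr.fixed.output_project <- as.data.frame(gwr.fixed_project$SDF)
pci_2021.fixed_project <- cbind(pci_2021, as.matrix(gwr.fixed.output_project))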

glimpse(pci_2021.fixed_project)
Rows: 66
Columns: 74
$ Intercept                       <dbl> -2029.95467, -932.37899, 894.71344, -6…
$ X.Entry.Costs.                  <dbl> -407.7129, -478.9765, -409.1094, -399.…
$ X.Land.Access.                  <dbl> 529.76677, -54.62115, -141.09552, 481.…
$ Transparency                    <dbl> -325.60467, -32.50034, -166.54523, -37…
$ X.Time.Costs.                   <dbl> 353.1721, 450.6535, 411.2222, 396.8354…
$ X.Informal.charges.             <dbl> -411.3981, -193.0164, -231.4254, -546.…
$ Proactivity                     <dbl> -153.73344, -12.23230, -14.30985, -161…
$ X.Business.Support.Policy.      <dbl> 732.5428, 412.1067, 339.5726, 707.5992…
$ X.Labor.Policy.                 <dbl> 828.7843, 664.5272, 677.9096, 852.7153…
$ X.Law...Order.                  <dbl> -713.1389, -467.7242, -469.8577, -724.…
$ y                               <dbl> 31, 595, 4, 15, 1820, 65, 99, 4073, 41…
$ yhat                            <dbl> -400.13304, 249.19696, 174.39859, 90.2…
$ residual                        <dbl> 431.13304, 345.80304, -170.39859, -75.…
$ CV_Score                        <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Stud_residual                   <dbl> 0.37323197, 0.28012187, -0.17698405, -…
$ Intercept_SE                    <dbl> 5290.157, 4717.062, 5085.611, 7461.838…
$ X.Entry.Costs._SE               <dbl> 508.1250, 419.8693, 446.7241, 585.5165…
$ X.Land.Access._SE               <dbl> 525.4821, 520.0565, 559.0493, 673.0689…
$ Transparency_SE                 <dbl> 404.8617, 430.3879, 462.4965, 432.1743…
$ X.Time.Costs._SE                <dbl> 380.3176, 417.3923, 485.3297, 426.4734…
$ X.Informal.charges._SE          <dbl> 441.0214, 491.4422, 594.5691, 520.9128…
$ Proactivity_SE                  <dbl> 522.4891, 455.9661, 515.8280, 557.3436…
$ X.Business.Support.Policy._SE   <dbl> 294.3359, 305.0581, 344.2616, 366.8382…
$ X.Labor.Policy._SE              <dbl> 402.7943, 349.9538, 374.2111, 433.6041…
$ X.Law...Order._SE               <dbl> 515.3410, 542.8703, 603.1861, 599.5543…
$ Intercept_TV                    <dbl> -0.383722975, -0.197660957, 0.17593036…
$ X.Entry.Costs._TV               <dbl> -0.8023870, -1.1407754, -0.9157988, -0…
$ X.Land.Access._TV               <dbl> 1.00815386, -0.10502926, -0.25238478, …
$ Transparency_TV                 <dbl> -0.80423678, -0.07551406, -0.36010053,…
$ X.Time.Costs._TV                <dbl> 0.9286239, 1.0796883, 0.8473048, 0.930…
$ X.Informal.charges._TV          <dbl> -0.9328302, -0.3927550, -0.3892321, -1…
$ Proactivity_TV                  <dbl> -0.29423281, -0.02682720, -0.02774152,…
$ X.Business.Support.Policy._TV   <dbl> 2.4887983, 1.3509118, 0.9863797, 1.928…
$ X.Labor.Policy._TV              <dbl> 2.057587, 1.898900, 1.811570, 1.966576…
$ X.Law...Order._TV               <dbl> -1.3838194, -0.8615763, -0.7789598, -1…
$ Local_R2                        <dbl> 0.3754886, 0.4586589, 0.5009512, 0.391…
$ Intercept.1                     <named list> -2029.955, -932.379, 894.7134, …
$ X.Entry.Costs..1                <named list> -407.7129, -478.9765, -409.1094…
$ X.Land.Access..1                <named list> 529.7668, -54.62115, -141.0955,…
$ Transparency.1                  <named list> -325.6047, -32.50034, -166.5452…
$ X.Time.Costs..1                 <named list> 353.1721, 450.6535, 411.2222, 3…
$ X.Informal.charges..1           <named list> -411.3981, -193.0164, -231.4254…
$ Proactivity.1                   <named list> -153.7334, -12.2323, -14.30985,…
$ X.Business.Support.Policy..1    <named list> 732.5428, 412.1067, 339.5726, 7…
$ X.Labor.Policy..1               <named list> 828.7843, 664.5272, 677.9096, 8…
$ X.Law...Order..1                <named list> -713.1389, -467.7242, -469.8577…
$ y.1                             <named list> 31, 595, 4, 15, 1820, 65, 99, 4…
$ yhat.1                          <named list> -400.133, 249.197, 174.3986, 90…
$ residual.1                      <named list> 431.133, 345.803, -170.3986, -7…
$ CV_Score.1                      <named list> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ Stud_residual.1                 <named list> 0.373232, 0.2801219, -0.176984,…
$ Intercept_SE.1                  <named list> 5290.157, 4717.062, 5085.611, 7…
$ X.Entry.Costs._SE.1             <named list> 508.125, 419.8693, 446.7241, 58…
$ X.Land.Access._SE.1             <named list> 525.4821, 520.0565, 559.0493, 6…
$ Transparency_SE.1               <named list> 404.8617, 430.3879, 462.4965, 4…
$ X.Time.Costs._SE.1              <named list> 380.3176, 417.3923, 485.3297, 4…
$ X.Informal.charges._SE.1        <named list> 441.0214, 491.4422, 594.5691, 5…
$ Proactivity_SE.1                <named list> 522.4891, 455.9661, 515.828, 55…
$ X.Business.Support.Policy._SE.1 <named list> 294.3359, 305.0581, 344.2616, 3…
$ X.Labor.Policy._SE.1            <named list> 402.7943, 349.9538, 374.2111, 4…
$ X.Law...Order._SE.1             <named list> 515.341, 542.8703, 603.1861, 59…
$ Intercept_TV.1                  <named list> -0.383723, -0.197661, 0.1759304…
$ X.Entry.Costs._TV.1             <named list> -0.802387, -1.140775, -0.915798…
$ X.Land.Access._TV.1             <named list> 1.008154, -0.1050293, -0.252384…
$ Transparency_TV.1               <named list> -0.8042368, -0.07551406, -0.360…
$ X.Time.Costs._TV.1              <named list> 0.9286239, 1.079688, 0.8473048,…
$ X.Informal.charges._TV.1        <named list> -0.9328302, -0.392755, -0.38923…
$ Proactivity_TV.1                <named list> -0.2942328, -0.0268272, -0.0277…
$ X.Business.Support.Policy._TV.1 <named list> 2.488798, 1.350912, 0.9863797, …
$ X.Labor.Policy._TV.1            <named list> 2.057587, 1.8989, 1.81157, 1.96…
$ X.Law...Order._TV.1             <named list> -1.383819, -0.8615763, -0.77895…
$ Local_R2.1                      <named list> 0.3754886, 0.4586589, 0.5009512…
$ geometry.1                      <named list> [MULTIPOLYGON (((519993.2 12...…
$ geometry                        <MULTIPOLYGON [m]> MULTIPOLYGON (((519993.2 …
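The `_TV` columns in the output above hold the local t-values of each coefficient. One quick, unadjusted screen for local significance is to compare their absolute values with a critical value from the t distribution, using the effective degrees of freedom reported in the GWR summary (52.83175 for this model). The sketch below applies that idea to the first three values printed by `glimpse()`; the object names and the 5% level are choices made for illustration. GWmodel also provides `gwr.t.adjust()` for multiple-testing adjustments, which this sketch does not attempt.

```r
# First three local t-values per variable, copied from the glimpse() output above
t_values <- data.frame(
  business_support = c(2.4887983, 1.3509118, 0.9863797),
  labor_policy     = c(2.057587, 1.898900, 1.811570)
)

edf    <- 52.83175   # effective degrees of freedom from the GWR summary
alpha  <- 0.05       # two-sided significance level (a choice, not from the source)
t_crit <- qt(1 - alpha / 2, df = edf)

# TRUE where the local coefficient is nominally significant (no multiple-testing adjustment)
significant <- abs(t_values) > t_crit

print(round(t_crit, 3))
print(significant)
```

Mapping such a flag next to the coefficient itself helps separate provinces where a PCI dimension has a detectable local effect from those where the estimate is indistinguishable from zero.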
# Set tmap options to check and fix any invalid polygons
tmap_options(check.and.fix = TRUE)

tmap_mode("view")
tmap mode set to interactive viewing
str(pci_2021.fixed_project$Local_R2)
 num [1:66] 0.375 0.459 0.501 0.392 0.45 ...
# Local_R2 is already a numeric vector (see str() above), so unlist() leaves it unchanged.
pci_2021.fixed_project$Local_R2 <- unlist(pci_2021.fixed_project$Local_R2)


# Base layer: provincial boundaries; top layer: provinces shaded by local R2
tm_shape(provincial_boundaries) +
  tm_polygons(alpha = 0.1) +
  tm_shape(pci_2021.fixed_project) +
  tm_polygons(col = "Local_R2",
              border.col = "gray60",
              border.lwd = 1) +
  tm_view(set.zoom.limits = c(5, 8))
Warning: The shape provincial_boundaries is invalid (after reprojection). See
sf::st_is_valid
Warning: The shape pci_2021.fixed_project is invalid (after reprojection). See
sf::st_is_valid
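The warnings above point to `sf::st_is_valid`. A minimal sketch of checking and repairing geometries with sf follows, using a deliberately invalid toy polygon rather than the provincial data; the object names are hypothetical. Whether the same repair would silence the tmap warnings is an assumption, since they are reported after reprojection.

```r
library(sf)

# A deliberately invalid "bowtie" polygon (self-intersecting ring), for illustration only
bowtie <- st_sfc(
  st_polygon(list(rbind(c(0, 0), c(1, 1), c(1, 0), c(0, 1), c(0, 0))))
)
toy <- st_sf(id = 1, geometry = bowtie)

# Check validity before the repair
valid_before <- st_is_valid(toy)

# Repair the geometry; the same call could be tried on the layers named in the warnings
toy_fixed   <- st_make_valid(toy)
valid_after <- st_is_valid(toy_fixed)

print(c(before = valid_before, after = valid_after))
```

Applied to the real layers, the equivalent step would be to run `st_make_valid()` on `provincial_boundaries` and `pci_2021.fixed_project` before passing them to `tm_shape()`.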
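# Convert the GWR output (SDF) of the adaptive capital model into an sf data frame (EPSG:3405).
# Note: this overwrites pci_2021 again, replacing the fixed-bandwidth project results.
pci_2021 <- st_as_sf(gwr.adaptive_capital$SDF) %>%
  st_transform(crs=3405)

# As before, cbind() appends the SDF columns a second time with a ".1" suffix.
gwr.adaptive.output_capital <- as.data.frame(gwr.adaptive_capital$SDF)
pci_2021.adaptive_capital <- cbind(pci_2021, as.matrix(gwr.adaptive.output_capital))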

glimpse(pci_2021.adaptive_capital)
Rows: 66
Columns: 74
$ Intercept                       <dbl> -36357.19, -39500.61, -39839.53, -3680…
$ X.Entry.Costs.                  <dbl> -7683.232, -1420.801, -1339.037, -7873…
$ X.Land.Access.                  <dbl> 4319.8435, -299.3851, -319.7592, 4399.…
$ Transparency                    <dbl> -665.91049, 794.51327, 753.97957, -730…
$ X.Time.Costs.                   <dbl> 8266.566, 3794.920, 3697.926, 8322.321…
$ X.Informal.charges.             <dbl> -2735.104, -3462.981, -3347.872, -2687…
$ Proactivity                     <dbl> -4111.6141396, -221.8046909, -182.9868…
$ X.Business.Support.Policy.      <dbl> 5912.261, 3218.825, 3231.670, 5866.189…
$ X.Labor.Policy.                 <dbl> 8019.760, 6507.622, 6489.537, 8176.664…
$ X.Law...Order.                  <dbl> -4236.344, -1369.919, -1383.720, -4160…
$ y                               <dbl> 317.31, 9408.00, 7.90, 4496.04, 23317.…
$ yhat                            <dbl> 3102.7876, 3251.9995, 1321.5255, -957.…
$ residual                        <dbl> -2785.47762, 6156.00054, -1313.62546, …
$ CV_Score                        <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Stud_residual                   <dbl> -0.457111938, 0.916567017, -0.23380761…
$ Intercept_SE                    <dbl> 41159.58, 31226.01, 31673.08, 41965.30…
$ X.Entry.Costs._SE               <dbl> 3467.347, 2747.628, 2789.168, 3505.653…
$ X.Land.Access._SE               <dbl> 3903.781, 3521.329, 3592.957, 3965.241…
$ Transparency_SE                 <dbl> 2632.882, 2933.076, 2986.650, 2659.684…
$ X.Time.Costs._SE                <dbl> 2598.653, 2912.814, 2974.442, 2618.904…
$ X.Informal.charges._SE          <dbl> 3115.716, 3501.277, 3617.090, 3153.564…
$ Proactivity_SE                  <dbl> 3349.085, 3091.690, 3133.565, 3385.799…
$ X.Business.Support.Policy._SE   <dbl> 2132.031, 2034.294, 2078.796, 2156.493…
$ X.Labor.Policy._SE              <dbl> 2549.012, 2279.120, 2313.595, 2578.758…
$ X.Law...Order._SE               <dbl> 3516.913, 3713.756, 3820.423, 3552.740…
$ Intercept_TV                    <dbl> -0.8833225, -1.2649906, -1.2578360, -0…
$ X.Entry.Costs._TV               <dbl> -2.2158823, -0.5171011, -0.4800848, -2…
$ X.Land.Access._TV               <dbl> 1.10657936, -0.08502047, -0.08899612, …
$ Transparency_TV                 <dbl> -0.25292076, 0.27088054, 0.25244996, -…
$ X.Time.Costs._TV                <dbl> 3.181097, 1.302836, 1.243233, 3.177788…
$ X.Informal.charges._TV          <dbl> -0.8778413, -0.9890621, -0.9255707, -0…
$ Proactivity_TV                  <dbl> -1.227682827, -0.071742205, -0.0583957…
$ X.Business.Support.Policy._TV   <dbl> 2.773065, 1.582281, 1.554588, 2.720245…
$ X.Labor.Policy._TV              <dbl> 3.146222, 2.855323, 2.804959, 3.170775…
$ X.Law...Order._TV               <dbl> -1.2045632, -0.3688769, -0.3621902, -1…
$ Local_R2                        <dbl> 0.7322584, 0.6414571, 0.6553979, 0.741…
$ Intercept.1                     <named list> -36357.19, -39500.61, -39839.53…
$ X.Entry.Costs..1                <named list> -7683.232, -1420.801, -1339.037…
$ X.Land.Access..1                <named list> 4319.844, -299.3851, -319.7592,…
$ Transparency.1                  <named list> -665.9105, 794.5133, 753.9796, …
$ X.Time.Costs..1                 <named list> 8266.566, 3794.92, 3697.926, 83…
$ X.Informal.charges..1           <named list> -2735.104, -3462.981, -3347.872…
$ Proactivity.1                   <named list> -4111.614, -221.8047, -182.9869…
$ X.Business.Support.Policy..1    <named list> 5912.261, 3218.825, 3231.67, 58…
$ X.Labor.Policy..1               <named list> 8019.76, 6507.622, 6489.537, 81…
$ X.Law...Order..1                <named list> -4236.344, -1369.919, -1383.72,…
$ y.1                             <named list> 317.31, 9408, 7.9, 4496.04, 233…
$ yhat.1                          <named list> 3102.788, 3251.999, 1321.525, -…
$ residual.1                      <named list> -2785.478, 6156.001, -1313.625,…
$ CV_Score.1                      <named list> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ Stud_residual.1                 <named list> -0.4571119, 0.916567, -0.233807…
$ Intercept_SE.1                  <named list> 41159.58, 31226.01, 31673.08, 4…
$ X.Entry.Costs._SE.1             <named list> 3467.347, 2747.628, 2789.168, 3…
$ X.Land.Access._SE.1             <named list> 3903.781, 3521.329, 3592.957, 3…
$ Transparency_SE.1               <named list> 2632.882, 2933.076, 2986.65, 26…
$ X.Time.Costs._SE.1              <named list> 2598.653, 2912.814, 2974.442, 2…
$ X.Informal.charges._SE.1        <named list> 3115.716, 3501.277, 3617.09, 31…
$ Proactivity_SE.1                <named list> 3349.085, 3091.69, 3133.565, 33…
$ X.Business.Support.Policy._SE.1 <named list> 2132.031, 2034.294, 2078.796, 2…
$ X.Labor.Policy._SE.1            <named list> 2549.012, 2279.12, 2313.595, 25…
$ X.Law...Order._SE.1             <named list> 3516.913, 3713.756, 3820.423, 3…
$ Intercept_TV.1                  <named list> -0.8833225, -1.264991, -1.25783…
$ X.Entry.Costs._TV.1             <named list> -2.215882, -0.5171011, -0.48008…
$ X.Land.Access._TV.1             <named list> 1.106579, -0.08502047, -0.08899…
$ Transparency_TV.1               <named list> -0.2529208, 0.2708805, 0.25245,…
$ X.Time.Costs._TV.1              <named list> 3.181097, 1.302836, 1.243233, 3…
$ X.Informal.charges._TV.1        <named list> -0.8778413, -0.9890621, -0.9255…
$ Proactivity_TV.1                <named list> -1.227683, -0.0717422, -0.05839…
$ X.Business.Support.Policy._TV.1 <named list> 2.773065, 1.582281, 1.554588, 2…
$ X.Labor.Policy._TV.1            <named list> 3.146222, 2.855323, 2.804959, 3…
$ X.Law...Order._TV.1             <named list> -1.204563, -0.3688769, -0.36219…
$ Local_R2.1                      <named list> 0.7322584, 0.6414571, 0.6553979…
$ geometry.1                      <named list> [MULTIPOLYGON (((519993.2 12...…
$ geometry                        <MULTIPOLYGON [m]> MULTIPOLYGON (((519993.2 …
# Set tmap options to check and fix any invalid polygons
tmap_options(check.and.fix = TRUE)

tmap_mode("view")
tmap mode set to interactive viewing
str(pci_2021.adaptive_capital$Local_R2)
 num [1:66] 0.732 0.641 0.655 0.741 0.642 ...
pci_2021.adaptive_capital$Local_R2 <- unlist(pci_2021.adaptive_capital$Local_R2)


tm_shape(provincial_boundaries)+
  tm_polygons(alpha = 0.1) +
tm_shape(pci_2021.adaptive_capital) +  
  tm_polygons(col = "Local_R2",
          border.col = "gray60",
          border.lwd = 1) +
  tm_view(set.zoom.limits = c(5,8))
Warning: The shape provincial_boundaries is invalid (after reprojection). See
sf::st_is_valid
Warning: The shape pci_2021.adaptive_capital is invalid (after reprojection).
See sf::st_is_valid


Visualising coefficient estimates

tmap_mode("view")
tmap mode set to interactive viewing
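# Standard error of the local Transparency coefficient (from pci_2021.fixed_project)
transparency_se <- tm_shape(provincial_boundaries)+
  tm_polygons(alpha = 0.1) +
tm_shape(pci_2021.fixed_project) +  
  tm_polygons(col = "Transparency_SE",
          border.col = "gray60",
          border.lwd = 1) +
  tm_view(set.zoom.limits = c(5,8))

# t-value of the local Transparency coefficient (from pci_2021.adaptive_capital)
# Note: the two maps draw on different model outputs, so read them side by side with care
transparency_tv <- tm_shape(provincial_boundaries)+
  tm_polygons(alpha = 0.1) +
tm_shape(pci_2021.adaptive_capital) +  
  tm_polygons(col = "Transparency_TV",
          border.col = "gray60",
          border.lwd = 1) +
  tm_view(set.zoom.limits = c(5,8))

# Show both maps in one row with synchronised pan and zoom
tmap_arrange(transparency_se, transparency_tv, 
             asp=1, ncol=2,
             sync = TRUE)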
Warning: The shape provincial_boundaries is invalid (after reprojection). See
sf::st_is_valid
Warning: The shape pci_2021.fixed_project is invalid (after reprojection). See
sf::st_is_valid
Warning: The shape provincial_boundaries is invalid (after reprojection). See
sf::st_is_valid
Warning: The shape pci_2021.adaptive_capital is invalid (after reprojection).
See sf::st_is_valid
Variable(s) "Transparency_TV" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.



6.0 Shiny Storyboard

6.1 Global Explanatory Model

6.1.1 Multiple Linear Regression (MLR)

Multiple Linear Regression (MLR) provides a baseline model to predict an outcome using multiple predictor variables. It assumes a consistent, global relationship across all data points, offering a broad understanding of how these variables influence the dependent variable overall.

Users have the flexibility to select specific combinations of predictor variables (e.g., PCI factors) to explore different modeling outcomes and compare how these combinations affect results.

Interpretation: A higher Adjusted R2 indicates that the model explains more of the variation in the outcome after penalising for the number of predictors. The p-value reflects statistical significance: a lower value is stronger evidence against the null hypothesis that the predictors have no effect on the outcome.
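As a minimal sketch of this comparison, the snippet below fits two candidate predictor sets and reports the Adjusted R2 and p-values for each. The built-in `mtcars` data stands in for the PCI/FDI table, and `fit_mlr()` is a hypothetical helper written for this illustration, not a function from the Shiny application.

```r
# Hypothetical helper: fit an MLR for a chosen response and predictor set
# and return the statistics discussed above.
fit_mlr <- function(data, response, predictors) {
  fit <- lm(reformulate(predictors, response = response), data = data)
  s <- summary(fit)
  f <- s$fstatistic
  list(
    adj_r2  = s$adj.r.squared,
    # p-value of the overall F-test (all slopes equal to zero)
    model_p = unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)),
    # p-value of each individual coefficient
    coef_p  = coef(s)[, "Pr(>|t|)"]
  )
}

# mtcars is a placeholder dataset, not the PCI data
small <- fit_mlr(mtcars, "mpg", c("wt"))
large <- fit_mlr(mtcars, "mpg", c("wt", "hp", "qsec"))

print(c(small = small$adj_r2, large = large$adj_r2))
print(signif(large$coef_p, 3))
```

Comparing `adj_r2` across predictor sets mirrors what a user does when switching combinations of PCI factors in the interface.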


6.1.2 Stepwise Model Selection

Stepwise Model Selection allows users to refine the Multiple Linear Regression model by adding or removing predictor variables in a systematic way.

Users can select from 3 approaches:

  • forward selection (starting with no predictors and adding them),

  • backward elimination (starting with all predictors and removing them),

  • or both (a combination of adding and removing).

Users also control the confidence level (e.g., 0.95 or 0.99), which sets how strict the significance threshold is for including a predictor: a higher confidence level admits fewer predictors.
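The three directions can be sketched with base R's `step()`. Note the swap: `step()` selects on AIC rather than on a significance threshold, so it illustrates the search directions only, not the confidence-level control. The `mtcars` data and the candidate predictor list are placeholders.

```r
# Placeholder data and candidate predictors (not the PCI dimensions)
null_model <- lm(mpg ~ 1, data = mtcars)
full_model <- lm(mpg ~ wt + hp + qsec + drat + disp, data = mtcars)
scope <- list(lower = ~ 1, upper = ~ wt + hp + qsec + drat + disp)

# Forward: start empty and add predictors one at a time
forward <- step(null_model, scope = scope, direction = "forward", trace = 0)
# Backward: start with every predictor and drop them one at a time
backward <- step(full_model, direction = "backward", trace = 0)
# Both: at each step, consider adding or dropping a predictor
both <- step(null_model, scope = scope, direction = "both", trace = 0)

selected_terms <- function(model) attr(terms(model), "term.labels")
print(lapply(list(forward = forward, backward = backward, both = both),
             selected_terms))
```

The three searches need not agree on the final predictor set, which is why comparing their results side by side is useful.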

Interpretation and Visualization: A radar chart provides a visual comparison of different models, allowing users to select and view how models perform across chosen predictors and confidence levels. This helps in identifying the most effective model based on the desired balance of predictors and statistical robustness.


6.1.3 Visualise Model Parameters

Visualize Model Parameters offers users an interactive way to examine and compare the effects of predictor variables across different stepwise models. Users can select their preferred model from the stepwise selection results and sort parameter values in ascending or descending order for easier comparison.

Interpretation and Customization: This visualization provides a clear view of each predictor’s influence on the outcome variable within the selected model, helping users assess the relative importance of predictors. By sorting the parameters, users can quickly identify the most impactful variables or spot subtle differences across models, aiding in deeper analysis and model refinement.
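A minimal sketch of the sorting step, assuming the selected model is an ordinary `lm` object. `sort_params()` is a hypothetical helper and `mtcars` is placeholder data.

```r
# Placeholder model standing in for one of the stepwise results
fit <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)

# Collect estimates, standard errors and p-values in one table
cs <- coef(summary(fit))
params <- data.frame(
  term      = rownames(cs),
  estimate  = cs[, "Estimate"],
  std_error = cs[, "Std. Error"],
  p_value   = cs[, "Pr(>|t|)"],
  row.names = NULL
)
# The intercept is not a predictor effect, so drop it before sorting
params <- params[params$term != "(Intercept)", ]

# Hypothetical helper: order the table by estimate, in either direction
sort_params <- function(params, decreasing = TRUE) {
  params[order(params$estimate, decreasing = decreasing), ]
}

print(sort_params(params, decreasing = TRUE))
```

Raw estimates are directly comparable only when the predictors share a scale; otherwise, standardise the predictors before ranking them.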


6.2 Local Explanatory Model

6.2.1 Bandwidth Selection

Bandwidth Selection in Geographically Weighted Regression (GWR) allows users to fine-tune the model’s spatial sensitivity by choosing between fixed and adaptive bandwidth options.

  • Side-by-Side Comparison:
    Users can compare the effects of fixed vs. adaptive bandwidths side-by-side. This comparison highlights how each bandwidth type influences the spatial scale of the analysis:

    • Fixed bandwidth applies a constant spatial radius for all data points, which is ideal for evenly spaced data.

    • Adaptive bandwidth adjusts based on data density, using a larger bandwidth in sparse areas and a smaller one in dense areas, enhancing accuracy in regions with varying data distributions.

  • Selection Options for Each Bandwidth Type:

    • Approach: Users can choose between cross-validation (CV) and corrected AIC (AICc) to optimize the bandwidth.

    • Kernel Method: Users can select the kernel type (e.g., Gaussian, bisquare, or tricube) to define the shape and weighting of spatial influence around each data point, tailoring the model to the spatial structure of the data.

Purpose:
This flexible bandwidth selection process allows users to determine the optimal balance of local vs. global influence, enhancing model accuracy and providing insights into spatial patterns at different scales.
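The difference between the two bandwidth types can be sketched in a few lines of base R. The kernel formulas below are common textbook definitions, and the package used in the analysis may scale them slightly differently. The distances, the 50 km radius and `k = 6` are hypothetical values chosen for illustration.

```r
# Weight given to a neighbour at distance d for bandwidth bw
kernel_weight <- function(d, bw, kernel = c("gaussian", "bisquare", "tricube")) {
  kernel <- match.arg(kernel)
  u <- d / bw
  switch(kernel,
    gaussian = exp(-0.5 * u^2),                 # never reaches exactly zero
    bisquare = ifelse(u < 1, (1 - u^2)^2, 0),   # zero at and beyond the bandwidth
    tricube  = ifelse(u < 1, (1 - u^3)^3, 0)    # zero at and beyond the bandwidth
  )
}

# Adaptive bandwidth: the distance to the k-th nearest neighbour
adaptive_bw <- function(d, k) sort(d)[k]

# Hypothetical distances (km) from one regression point to its neighbours
d <- c(5, 12, 20, 35, 60, 90, 150, 300)

w_fixed <- kernel_weight(d, bw = 50, kernel = "bisquare")
w_adapt <- kernel_weight(d, bw = adaptive_bw(d, k = 6), kernel = "bisquare")

print(round(rbind(distance = d, fixed = w_fixed, adaptive = w_adapt), 3))
```

With the fixed radius, only neighbours closer than 50 km contribute. With the adaptive rule, the radius stretches to reach the sixth neighbour, so a point in a sparse area still borrows information from enough observations.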


6.2.2 Visualise Local R2

Visualizing Local R2 provides an in-depth look at how well the Geographically Weighted Regression (GWR) model explains variations in the outcome across different areas. This visualization highlights spatial differences in model performance, making it easier to identify regions where predictors more effectively capture local patterns.

  • Customization Options:

    • Bandwidth Type: Choose between fixed or adaptive bandwidth to control the scale of spatial influence.

    • Bandwidth Optimization Approach: Select cross-validation (CV) or corrected AIC (AICc) to determine the optimal bandwidth. CV targets out-of-sample predictive accuracy, while AICc trades goodness of fit against model complexity.

    • Kernel Method: Choose the kernel type (e.g., Gaussian, bisquare, or tricube) to set the shape and weighting of spatial influence, refining how local R2 values are calculated across locations.


This flexible visualization helps users assess where the model has strong or weak explanatory power across the study area, offering insights into local model fit. By adjusting bandwidth, approach, and kernel, users can explore how these choices impact model performance, identifying areas with robust predictions and areas needing further investigation.
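To make the idea of a local R2 concrete, the sketch below computes one simple version of it: a kernel-weighted R2 from a weighted least-squares fit at each location. The data are simulated, the bandwidth is a hypothetical value, and GWR packages may define the statistic somewhat differently, so treat this as an illustration rather than a reproduction of the mapped values.

```r
set.seed(1)
n <- 60
coords <- cbind(runif(n), runif(n))   # simulated locations in a unit square
x <- rnorm(n)
# The slope grows from west to east, so the relationship is non-stationary
y <- 1 + (0.5 + 3 * coords[, 1]) * x + rnorm(n, sd = 0.5)

D <- as.matrix(dist(coords))          # pairwise distances
bw <- 0.3                             # hypothetical fixed bandwidth

local_r2 <- vapply(seq_len(n), function(i) {
  w <- exp(-0.5 * (D[i, ] / bw)^2)    # Gaussian kernel weights around point i
  fit <- lm(y ~ x, weights = w)       # local weighted least-squares fit
  ybar_w <- sum(w * y) / sum(w)       # weighted mean of the outcome
  rss_w <- sum(w * (y - fitted(fit))^2)
  tss_w <- sum(w * (y - ybar_w)^2)
  1 - rss_w / tss_w
}, numeric(1))

print(summary(local_r2))
```

Each value describes how well the predictors explain the outcome in the neighbourhood of one location, which is the quantity shaded on the Local R2 map.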


6.5 Spatial Non-Stationarity Assumption

The Spatial Non-Stationarity Assumption tool helps assess whether relationships between variables vary across spatial locations. In this analysis, it tests whether the influence of Provincial Competitiveness Index (PCI) dimensions on FDI metrics (Total Number of Projects and Total Registered Capital) remains consistent across provinces or changes with regional characteristics.
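One common way to test this is a Monte Carlo permutation test, sketched below on simulated data. It is an assumption that the tool works this way. The sketch shuffles the locations among the observations, refits the local models, and checks how often the shuffled data produce local coefficients as variable as the observed ones. The bandwidth, sample size and number of simulations are hypothetical.

```r
set.seed(42)
n <- 50
coords <- cbind(runif(n), runif(n))   # simulated locations
x <- rnorm(n)
# The slope of x varies from west to east; the intercept does not
y <- 1 + (0.5 + 3 * coords[, 1]) * x + rnorm(n, sd = 0.5)
X <- cbind(1, x)                      # design matrix with intercept column
bw <- 0.3                             # hypothetical fixed bandwidth

# Standard deviation of the local coefficients for a given set of locations
coef_sd <- function(coords) {
  D <- as.matrix(dist(coords))
  betas <- t(vapply(seq_len(n), function(i) {
    w <- exp(-0.5 * (D[i, ] / bw)^2)  # Gaussian kernel weights
    lm.wfit(X, y, w)$coefficients     # local weighted least squares
  }, numeric(2)))
  apply(betas, 2, sd)
}

observed <- coef_sd(coords)

# Null distribution: break the link between observations and locations
n_sim <- 99
simulated <- t(replicate(n_sim, coef_sd(coords[sample(n), ])))

# Share of runs (observed run included) at least as variable as observed
p_value <- (colSums(sweep(simulated, 2, observed, ">=")) + 1) / (n_sim + 1)
names(p_value) <- c("Intercept", "x")
print(p_value)
```

A small p-value for a coefficient suggests that its spatial variation is larger than random placement would produce, which supports modelling that relationship locally rather than globally.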